NPHP nucleic acids and proteins

ABSTRACT

The present invention relates to Nephronophthisis, in particular to the NPHP4 protein (nephroretinin or nephrocystin-4) and nucleic acids encoding the NPHP4 protein. The present invention also provides assays for the detection of NPHP4, and assays for detecting nephroretinin and inversin polymorphisms and mutations associated with disease states.

[0001] The present invention claims priority to U.S. Provisional Patent Application Serial No. 60/406,001, filed Aug. 26, 2002, the disclosure of which is hereby incorporated by reference in its entirety.

FIELD OF THE INVENTION

[0002] The present invention relates to Nephronophthisis, in particular to the NPHP4 protein (nephroretinin or nephrocystin-4) and nucleic acids encoding the NPHP4 protein. The present invention also provides assays for the detection of NPHP4, and assays for detecting nephroretinin and inversin polymorphisms and mutations associated with disease states.

BACKGROUND OF THE INVENTION

[0003] Nephronophthisis (NPHP), an autosomal recessive cystic kidney disease, constitutes the most frequent genetic cause for end-stage renal disease (ESRD) in children and young adults. NPHP is a progressive hereditary kidney disease marked by anemia, polyuria, renal loss of sodium, progressing to chronic renal failure, tubular atrophy, interstitial fibrosis, glomerular sclerosis, and medullary cysts.

[0004] The most prominent histologic feature of NPHP consists of renal fibrosis, which in chronic renal failure, regardless of origin, represents the pathogenic event correlated most strongly to loss of renal function (Zeisberg et al., Hypertens. 10:315 [2001]). Therefore, NPHP has been considered a model disease for the development of renal fibrosis. The only treatment for NPHP is renal replacement therapy for survival (Smith et al., Am. J. Dis. Child. 69:369 [1945]; Fanconi et al., Helv. Paediatr. Acta. 6:1 [1951]; Hildebrandt, (1999) Juvenile nephronophthisis. In: Avner E, Holliday M, Barrat T (eds.) Pediatric Nephrology. Williams & Wilkins, Baltimore).

[0005] Three distinct gene loci for nephronophthisis, NPHP1 [MIM 256100], NPHP2 [MIM 602088], and NPHP3 [MIM 604387], have been mapped to chromosomes 2q13 (Antignac et al., Nature Genet. 3:342 [1993]; Hildebrandt et al., Am J Hum Genet 53:1256-1261 [1993]), 9q22 (Haider et al., Am J Hum Genet 63:1404-1410 [1998]), and 3q22 (Omran et al., Am J Hum Genet 66:118-127 [2000]), respectively. These disease variants share renal histology of interstitial infiltrations, renal tubular cell atrophy with cyst development, and renal interstitial fibrosis (Waldherr et al., Virchows Arch A Pathol Anat Histol 394:235-254 [1982]). The variants can be distinguished clinically by age of onset of ESRD. Renal failure develops at median ages of 1 year, 13 years, and 19 years, in NPHP2, NPHP1, and NPHP3, respectively (Omran et al., [2000], supra).

[0006] Clearly there is a great need for identification of the molecular basis of NPHP, as well as for improved diagnostics and treatments for NPHP.

SUMMARY OF THE INVENTION

[0007] The present invention relates to Nephronophthisis, in particular to the NPHP4 protein (nephroretinin or nephrocystin-4) and nucleic acids encoding the NPHP4 protein. The present invention also provides assays for the detection of NPHP4, and assays for detecting nephroretinin and inversin polymorphisms and mutations associated with disease states.

[0008] Accordingly, in some embodiments, the present invention provides an isolated and purified nucleic acid comprising a sequence encoding a protein selected from the group consisting of SEQ ID NOs: 2, 6, 8, 10, 12, 14, 16, 18, and 20. In some embodiments, the sequence is operably linked to a heterologous promoter. In some embodiments, the sequence is contained within a vector. In some embodiments, the vector is within a host cell. In some embodiments, the present invention provides a computer readable medium encoding a representation of the nucleic acid sequence.

[0009] The present invention also provides an isolated and purified nucleic acid sequence that hybridizes under conditions of low stringency to a nucleic acid selected from the group consisting of SEQ ID NOs: 1, 5, 7, 9, 11, 13, 15, 17, and 19. In some embodiments, the sequence is contained within a vector. In some embodiments, the vector is in a host cell. In some embodiments, the host cell is located in an organism, wherein the organism is a non-human animal.

[0010] The present invention additionally provides a protein encoded by a nucleic acid selected from the group consisting of SEQ ID NOs: 1 and variants thereof that are at least 80% identical to SEQ ID NOs: 1, 5, 7, 9, 11, 13, 15, 17, and 19. In some embodiments, the protein is at least 90%, and preferably at least 95% identical to SEQ ID NOs: 1, 5, 7, 9, 11, 13, 15, 17, and 19. In some embodiments, the present invention provides a computer readable medium encoding a representation of the polypeptide sequence.
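The identity thresholds recited above (at least 80%, 90%, or 95%) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking gaps, and it defines identity as matching positions divided by alignment length; other definitions exist, and the function names `percent_identity` and `meets_threshold` are hypothetical and do not come from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same non-gap residue.

    Assumption: both inputs are pre-aligned and of equal length,
    with '-' used for gap positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy sequences for illustration only; they are not taken from any SEQ ID NO.
reference = "ATGCATGCAT"
variant = "ATGCATGCAA"  # differs at the last position
print(percent_identity(reference, variant))       # 9 of 10 positions match
print(meets_threshold(reference, variant, 95.0))  # below the 95% threshold
```

In practice the alignment step itself would be done with an established alignment tool; this sketch covers only the arithmetic applied after alignment.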

[0011] The present invention further provides a composition comprising a nucleic acid that inhibits the binding of at least a portion of a nucleic acid selected from the group consisting of SEQ ID NOs: 1, 5, 7, 9, 11, 13, 15, 17, and 19 to their complementary sequences. In other embodiments, the present invention provides a polynucleotide sequence comprising at least fifteen nucleotides capable of hybridizing under stringent conditions to the isolated nucleotide sequence.

[0012] In yet other embodiments, the present invention provides a composition comprising a variant nephroretinin polypeptide, wherein the polypeptide comprises a C-terminal truncation of SEQ ID NO:2. In some embodiments, the variant nephroretinin polypeptide is selected from the group consisting of SEQ ID NOs: 6, 10, 12, 14, 16, and 20. In some embodiments, the presence of the variant polypeptide in a subject is indicative of nephronophthisis type 4 kidney disease in the subject.

[0013] In still further embodiments, the present invention provides a method for detection of a variant nephroretinin polypeptide in a subject, comprising: providing a biological sample from a subject, wherein the biological sample comprises a nephroretinin polypeptide; and detecting the presence or absence of a variant nephroretinin polypeptide in the biological sample. In some embodiments, the variant nephroretinin polypeptide is a C-terminal truncation of SEQ ID NO:2. In some embodiments, the variant nephroretinin polypeptide is selected from the group consisting of SEQ ID NOs: 6, 10, 12, 14, 16, and 20. In some embodiments, the presence of the variant nephroretinin polypeptide is indicative of nephronophthisis type 4 kidney disease in the subject. In some embodiments, the biological sample is selected from the group consisting of a blood sample, a tissue sample, a urine sample, and an amniotic fluid sample. In some embodiments, the subject is selected from the group consisting of an embryo, a fetus, a newborn animal, and a young animal. In some embodiments, the animal is a human. In some embodiments, the detecting comprises differential antibody binding. In other embodiments, the detecting comprises a gel-free truncation test. In still other embodiments, the detection comprises a Western blot.

[0014] The present invention further provides a kit comprising a reagent for detecting the presence or absence of a variant nephroretinin polypeptide in a biological sample. In some embodiments, the kit further comprises instructions for using the kit for detecting the presence or absence of a variant nephroretinin polypeptide in a biological sample. In some embodiments, the instructions comprise instructions required by the U.S. Food and Drug Administration for in vitro diagnostic kits. In some embodiments, the kit further comprises instructions for diagnosing nephronophthisis in the subject based on the presence or absence of the variant nephroretinin polypeptide. In some embodiments, the nephronophthisis is nephronophthisis type 4. In some embodiments, the reagent is one or more antibodies. In some embodiments, the antibodies comprise a first antibody that specifically binds to the C-terminus of the nephroretinin polypeptide and a second antibody that specifically binds to the N-terminus of the nephroretinin polypeptide. In other embodiments, the reagents comprise reagents for performing a gel-free truncation test. In some embodiments, the variant nephroretinin polypeptide is a C-terminal truncation of SEQ ID NO:2; for example, in some embodiments, the variant nephroretinin polypeptide is selected from the group consisting of SEQ ID NOs: 6, 10, 12, 14, 16, and 20. In some embodiments, the biological sample is selected from the group consisting of a blood sample, a tissue sample, a urine sample, and an amniotic fluid sample.
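The two-antibody arrangement described above (one antibody specific to the C-terminus, one to the N-terminus) lends itself to a simple decision table. The following is a minimal sketch of how the two binding results might be combined, assuming each antibody yields a clean yes/no signal; the function name `interpret_binding` and the result labels are hypothetical, and real assays would require controls and quantitative thresholds that are not modeled here.

```python
def interpret_binding(n_term_bound: bool, c_term_bound: bool) -> str:
    """Combine N-terminal and C-terminal antibody results.

    Assumption: a C-terminally truncated polypeptide keeps its N-terminal
    epitope but has lost its C-terminal epitope.
    """
    if n_term_bound and c_term_bound:
        return "full-length polypeptide detected"
    if n_term_bound and not c_term_bound:
        return "consistent with C-terminal truncation"
    if not n_term_bound and not c_term_bound:
        return "no polypeptide detected"
    # C-terminal signal without N-terminal signal is not explained by a
    # C-terminal truncation, so it is flagged rather than interpreted.
    return "inconclusive"


for n_term, c_term in [(True, True), (True, False), (False, False), (False, True)]:
    print(n_term, c_term, "->", interpret_binding(n_term, c_term))
```

A heterozygous sample containing both full-length and truncated polypeptide would show both signals in this simplified scheme, which is one reason a quantitative comparison of the two signals may be preferable to a yes/no readout.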

[0015] In still further embodiments, the present invention provides a method for detection of a variant inversin polypeptide in a subject, comprising: providing a biological sample from a subject, wherein the biological sample comprises an inversin polypeptide; and detecting the presence or absence of a variant inversin polypeptide in the biological sample. In some embodiments, the variant inversin polypeptide is a C-terminal truncation of SEQ ID NO:22. In some embodiments, the variant inversin polypeptide is selected from the group consisting of SEQ ID NOs: 24, 26, 28, 30, 34, 36, 38 and 40. In some embodiments, the presence of the variant inversin polypeptide is indicative of nephronophthisis type 2 kidney disease in the subject. In some embodiments, the biological sample is selected from the group consisting of a blood sample, a tissue sample, a urine sample, and an amniotic fluid sample. In some embodiments, the subject is selected from the group consisting of an embryo, a fetus, a newborn animal, and a young animal. In some embodiments, the animal is a human. In some embodiments, the detecting comprises differential antibody binding. In other embodiments, the detecting comprises a gel-free truncation test. In still other embodiments, the detection comprises a Western blot.

[0016] The present invention also provides a kit comprising a reagent for detecting the presence or absence of a variant inversin polypeptide or nucleic acid in a biological sample. In further embodiments, the kit further comprises reagents for detecting the presence or absence of a variant nephroretinin polypeptide or nucleic acid, or a variant nephrocystin-3 polypeptide or nucleic acid. In some embodiments, the kit further comprises instructions for using the kit for detecting the presence or absence of a variant inversin polypeptide or nucleic acid in a biological sample. In some embodiments, the instructions comprise instructions required by the U.S. Food and Drug Administration for in vitro diagnostic kits. In some embodiments, the kit further comprises instructions for diagnosing nephronophthisis in the subject based on the presence or absence of the variant inversin polypeptide or nucleic acid. In some embodiments, the kit further comprises instructions for diagnosing nephronophthisis in the subject based on the presence or absence of the variant inversin polypeptide or nucleic acid, the variant nephroretinin polypeptide or nucleic acid, or the variant nephrocystin-3 polypeptide or nucleic acid. In some embodiments, the nephronophthisis is nephronophthisis type 2. In other embodiments, the nephronophthisis is nephronophthisis type 2, nephronophthisis type 4, or nephronophthisis type 3. In some embodiments, the reagent is one or more antibodies. In some embodiments, the antibodies comprise a first antibody that specifically binds to the C-terminus of the inversin polypeptide and a second antibody that specifically binds to the N-terminus of the inversin polypeptide. In other embodiments, the reagents comprise reagents for performing a gel-free truncation test. In some embodiments, the variant inversin polypeptide is a C-terminal truncation of SEQ ID NO:22; for example, in some embodiments, the variant inversin polypeptide is selected from the group consisting of SEQ ID NOs: 24, 26, 28, 30, 34, 36, 38 and 40. In some embodiments, the biological sample is selected from the group consisting of a blood sample, a tissue sample, a urine sample, and an amniotic fluid sample.

DESCRIPTION OF THE FIGURES

[0017] FIG. 1 shows haplotype results on chromosome 1p36 carried out for refining the NPHP4 locus in affected offspring from 3 consanguineous NPHP families. p-ter, telomeric; cen, centromeric; nd, not done.

[0018] FIG. 2 shows the positional cloning strategy for the NPHP4 gene on human chromosome 1p36. FIG. 2A, genetic map position for microsatellites used in linkage mapping of NPHP4 (see FIG. 1). Published flanking markers are underlined (Schuermann et al., Am. J. Hum. Genet. 70:1240 [2002]). p-ter, telomeric; cen, centromeric. FIG. 2B, physical map distances of critical microsatellites relative to D1S2660. The secure 1.2 Mb critical interval (solid bar) and the 700 kb suggestive critical interval (stippled bar) are shown delimited by the newly identified secure flanking markers (asterisks) and suggestive flanking markers (double asterisks) defined by haplotype analysis (see FIG. 1). Below the axis, known genes, predicted unknown genes, and the NPHP4 gene (alias Q9UFQ2) are represented as arrows in the direction of transcription. FIG. 2C, genomic organization of NPHP4 with exons indicated as vertical hatches and numbered. FIG. 2D, exon structure of NPHP4 cDNA. Black and white boxes represent the 30 exons encoding nephroretinin. The number of the first codon of each exon is indicated; exons beginning with the second or third base of a codon are indicated by “b” or “c”, respectively. At the bottom, locations of the 11 different mutations identified in 8 NPHP kindred are shown. fs, frameshift. FIG. 2E, NPHP4 mutations occurring homozygously in affected individuals of 5 consanguineous families (underlined). Mutated nucleotides and altered amino acids are depicted on grey background.

[0019] FIG. 3 shows Northern blot analysis of the NPHP4 expression pattern. Expression of a 5.9 kb transcript (arrowhead) is apparent in all tissues studied with highest expression in skeletal muscle.

[0020] FIG. 4 shows the nucleic acid (cDNA) (SEQ ID NO: 1) and amino acid (SEQ ID NO: 2) sequences of NPHP4.

[0021] FIG. 5 shows an alignment of human (SEQ ID NO: 2), mouse (SEQ ID NO: 3), and C. elegans (SEQ ID NO: 4) NPHP4 amino acid sequences.

[0022] FIG. 6 shows the nucleic acid (SEQ ID NO: 5) and amino acid (SEQ ID NO: 6) sequences of an exemplary NPHP4 variant found in family 3 (See Table 1).

[0023] FIG. 7 shows the nucleic acid (SEQ ID NO: 7) and amino acid (SEQ ID NO: 8) sequences of an exemplary NPHP4 variant found in family 24 (See Table 1).

[0024] FIG. 8 shows the nucleic acid (SEQ ID NO: 9) and amino acid (SEQ ID NO: 10) sequences of an exemplary NPHP4 variant found in family 30 (See Table 1).

[0025] FIG. 9 shows the nucleic acid (SEQ ID NO: 11) and amino acid (SEQ ID NO: 12) sequences of an exemplary NPHP4 variant found in family 32 (See Table 1).

[0026] FIG. 10 shows the nucleic acid (SEQ ID NO: 13) and amino acid (SEQ ID NO: 14) sequences of an exemplary NPHP4 variant found in family 60 (See Table 1).

[0027] FIG. 11 shows the nucleic acid (SEQ ID NO: 15) and amino acid (SEQ ID NO: 16) sequences of an exemplary NPHP4 variant found in family 461 (See Table 1).

[0028] FIG. 12 shows the nucleic acid (SEQ ID NO: 17) and amino acid (SEQ ID NO: 18) sequences of an additional exemplary NPHP4 variant found in family 461 (See Table 1).

[0029] FIG. 13 shows the nucleic acid (SEQ ID NO: 19) and amino acid (SEQ ID NO: 20) sequences of an exemplary NPHP4 variant found in family 622 (See Table 1).

[0030] FIG. 14 shows the nucleic acid (cDNA) (SEQ ID NO: 21) and amino acid (SEQ ID NO: 22) sequences of inversin.

[0031] FIG. 15 shows mutations in INVS in individuals with NPHP2. FIGS. 2a and 2d show mutations in INVS (nucleotide exchange and amino acid exchange) together with sequence traces for mutated sequences (top) and sequence from healthy controls (bottom). Family numbers are given above boxes. FIG. 2b shows the exon structure of INVS. FIG. 2c shows a representation of protein motifs found in inversin. aa, amino acid residues; Ank, ankyrin/swi6 motif; D1, D box1 (Apc2-binding); D2, D box2; IQ, calmodulin binding domains.

[0032] FIG. 16 depicts the specific nucleotide exchange (SEQ ID NO: 23) and resulting termination of the amino acid sequence (SEQ ID NO: 24) of an exemplary inversin variant found in family A6 (See Table 3).

[0033] FIG. 17 depicts a specific nucleotide deletion (SEQ ID NO: 25) and resulting termination of the amino acid sequence (SEQ ID NO: 26) of an exemplary inversin variant found in family A6 (See Table 3).

[0034] FIG. 18 depicts the specific nucleotide exchange (SEQ ID NO: 27) and resulting termination of the amino acid sequence (SEQ ID NO: 28) of an exemplary inversin variant found in family A8 (See Table 3).

[0035] FIG. 19 depicts the specific nucleotide exchange (SEQ ID NO: 29) and resulting termination of the amino acid sequence (SEQ ID NO: 30) of an exemplary inversin variant found in family A9 (See Table 3).

[0036] FIG. 20 depicts the specific nucleotide exchange (SEQ ID NO: 31) and resulting substitution in the amino acid sequence (SEQ ID NO: 32) of an exemplary inversin variant found in family A9 (See Table 3).

[0037] FIG. 21 depicts a specific nucleotide deletion (SEQ ID NO: 33) and resulting termination of the amino acid sequence (SEQ ID NO: 34) of an exemplary inversin variant found in family A10 (See Table 3).

[0038] FIG. 22 depicts the specific nucleotide exchange (SEQ ID NO: 35) and resulting termination of the amino acid sequence (SEQ ID NO: 36) of an exemplary inversin variant found in family A12 (See Table 3).

[0039] FIG. 23 depicts the specific nucleotide exchange (SEQ ID NO: 37) and resulting termination of the amino acid sequence (SEQ ID NO: 38) of an exemplary inversin variant found in family 868 (See Table 3).

[0040] FIG. 24 depicts a specific nucleotide insertion (SEQ ID NO: 39) and resulting termination of the amino acid sequence (SEQ ID NO: 40) of an exemplary inversin variant found in family 868 (See Table 3).

[0041] FIG. 25 depicts the specific nucleotide exchange (SEQ ID NO: 41) and resulting substitution in the amino acid sequence (SEQ ID NO: 42) of an exemplary inversin variant found in family A7 (See Table 3).

[0042] FIG. 26 shows the association of inversin with nephrocystin in HEK293T cells and in mouse tissue.

[0043] FIG. 27 shows the molecular interaction of nephrocystin with β-tubulin.

[0044] FIG. 28 shows the co-localization of nephrocystin and inversin to primary cilia in renal tubular epithelial cells.

[0045] FIG. 29 shows that disruption of zebrafish invs function results in renal cyst formation.

GENERAL DESCRIPTION OF THE INVENTION

[0046] The gene for nephronophthisis type 1 (NPHP1) has been cloned by positional cloning (Hildebrandt et al., Nature Genet 17:149-153 [1997]). Its gene product, nephrocystin, represents a novel docking protein, which interacts with the signaling proteins p130Cas, tensin, focal adhesion kinase 2, and filamin A and B, which are involved in cell-cell and cell-matrix signaling of renal epithelial cells (Hildebrandt and Otto, J Am Soc Nephrol 11:1753-1761 [2000]; Donaldson et al., Exp Cell Res 256:168-178 [2000]; Benzing et al., Proc Natl Acad Sci USA 98:9784-9789 [2001]; Donaldson et al., J Biol Chem 277:29028-29035 [2002]). The association of NPHP with autosomal recessive retinitis pigmentosa (RP) has been described as the so-called Senior-Løken syndrome (SLS [MIM 266900]) (Senior et al., Am J Ophthalmol 52:625-633 [1961]; Løken et al., Acta Paediatr 50:177-184 [1961]; each of which is herein incorporated by reference). In families with SLS, linkage has been demonstrated to the loci for NPHP1 and NPHP3 (Caridi et al., Am J Kidney Dis 32:1059-1062 [1998]; Omran et al., 2002, supra). Very recently, a new gene locus (NPHP4) for NPHP type 4 (Schuermann et al., Am. J. Hum. Genet. 70:1240 [2002]; herein incorporated by reference) has been identified and linkage of a large SLS kindred to this locus demonstrated.

[0047] Experiments conducted during the course of development of the present invention identified, by positional cloning, the gene (NPHP4) causing NPHP type 4, through demonstration of 9 likely loss-of-function mutations in 6 affected families. In addition, 2 loss-of-function mutations in patients from 2 families with SLS were detected. The conclusion that the gene cloned in the experiments described herein is the gene causing NPHP type 4 is based on identification, in 8 families with NPHP, of 9 distinct truncating mutations and 2 missense mutations, none of which occurred in over 92 healthy control individuals. Experiments conducted during the course of development of the present invention further demonstrated the presence of 2 homozygous truncating mutations also in 2 families with SLS (F3 and F60). A small percentage of patients also exhibit SLS in families with NPHP1 mutations (Caridi et al., Am. J. Kidney Disease 32:1059 [1998]) and in families linked to NPHP3 (Omran et al. 2002, supra). For all 3 genes no distinction can be made on the basis of allelic differences between the NPHP phenotypes with and without RP. Therefore, it seems likely that a stochastic pleiotropic effect is responsible for the occurrence of RP in NPHP types 1, 3 and 4. Accordingly, in some embodiments, the present invention provides the NPHP4 nucleic acid and amino acid sequence, as well as disease related variants thereof.

[0048] NPHP4 is a novel gene, which is unrelated to any known gene families. It encodes a novel protein, “nephroretinin” or “nephrocystin-4”. NPHP4, like NPHP1, is unique to the human genome, is conserved in C. elegans, and exhibits a broad expression pattern. Identification of the NPHP1 gene (Hildebrandt et al., Nature Genet. 17:149 [1997]) revealed nephrocystin as a novel docking protein, which interacts with p130Cas (Donaldson et al., Exp. Cell. Res. 256:168 [2000]; Hildebrandt and Otto, J. Am. Soc. Nephrol. 11:1753 [2000]), tensin, focal adhesion kinase 2 (Benzing et al., PNAS 98:9784 [2001]), and filamin A and B (Donaldson et al., 2002, supra), and which is involved in cell-cell and cell-matrix signaling. The present invention is not limited to a particular mechanism of action. Indeed, an understanding of the mechanism is not necessary to practice the present invention. Nonetheless, it is likely that both nephroretinin and nephrocystin interact within a novel shared pathogenic pathway. Thus, the present invention provides a novel gene with critical roles in renal tissue architecture and ophthalmic function.

[0049] Two additional gene loci have been mapped for NPHP. The locus NPHP3 associated with adolescent NPHP localizes to human chromosome 3q22 (Omran, et al., Am. J. Hum. Genet. 66, 118 [2000]), and NPHP2 associated with infantile NPHP resides on chromosome 9q21-q22 (Haider et al., Am. J. Hum. Genet. 63, 1404 [1998]). The kidney phenotype of NPHP2 combines features of NPHP, including tubular basement membrane disruption and renal interstitial fibrosis, with features of PKD (Gagnadoux et al., Pediatr. Nephrol. 3, 50 [1989]) including enlarged kidneys and widespread cyst development. During the course of development of the present invention, the human gene INVS was determined to be located in the NPHP2 critical genetic interval (Haider et al., Am. J. Hum. Genet. 63, 1404 [1998]).

[0050] In the inv/inv mouse model of insertional mutagenesis, a deletion of exons 3-11 of Invs encoding inversin causes a phenotype of cyst formation in enlarged kidneys, situs inversus and pancreatic islet cell dysplasia (Mochizuki et al., Nature 395, 177 [1998]; Morgan et al., Nat. Genet. 20, 149 [1998]). Histology of infantile NPHP2 and of the inv/inv mouse identified features resembling NPHP, namely interstitial fibrosis, mild interstitial cell infiltration, tubular cell atrophy, tubular cysts and periglomerular fibrosis. In addition, human NPHP2 and mouse inv/inv phenotypes showed features reminiscent of autosomal dominant PKD, such as kidney enlargement, absence of the tubular basement membrane irregularity characteristic of NPHP and presence of cysts also outside the medullary region.

[0051] Experiments conducted during the course of development of the present invention identified the gene (INVS) causing NPHP type 2, through demonstration of 8 likely loss-of-function mutations in 6 affected families. The conclusion that the gene identified in the experiments described herein is the gene causing NPHP type 2 is based on identification, in 7 families with NPHP, of 8 distinct truncating mutations and 2 missense mutations, none of which occurred in over 100 healthy control individuals.

Definitions

[0052] To facilitate understanding of the invention, a number of terms are defined below.

[0053] As used herein, the term “NPHP4” or “nephroretinin” or “nephrocystin-4” when used in reference to a protein or nucleic acid refers to a protein or nucleic acid encoding a protein that, in some mutant forms, is correlated with nephronophthisis. The term NPHP4 encompasses both proteins that are identical to wild-type NPHP4 and those that are derived from wild-type NPHP4 (e.g., variants of NPHP4 or chimeric genes constructed with portions of NPHP4 coding regions). In some embodiments, the “NPHP4” is the wild-type nucleic acid (SEQ ID NO: 1) or amino acid (SEQ ID NO: 2) sequence. In other embodiments, the “NPHP4” is a variant or mutant (e.g., including, but not limited to, the nucleic acid sequences described by SEQ ID NOS: 5, 7, 9, 11, 13, 15, 17, 19 and the amino acid sequences described by SEQ ID NOS: 6, 8, 10, 12, 14, 16, 18, and 20).

[0054] As used herein, the term “INVS” or “inversin” when used in reference to a protein or nucleic acid refers to a protein or nucleic acid encoding a protein that, in some mutant forms, is correlated with nephronophthisis. In some embodiments, the “inversin” is the wild-type nucleic acid (SEQ ID NO: 21) or amino acid (SEQ ID NO: 22) sequence. In other embodiments, the “inversin” is a variant or mutant (e.g., including, but not limited to, the nucleic acid sequences described by SEQ ID NOS: 23, 25, 27, 29, 31, 33, 35, 37, and 39 and the amino acid sequences described by SEQ ID NOS: 24, 26, 28, 30, 32, 34, 36, 38 and 40).

[0055] As used herein, the term “C-terminal truncation of SEQ ID NO:2” refers to a polypeptide comprising a portion of SEQ ID NO:2, wherein the portion comprises the N-terminus of SEQ ID NO:2. In preferred embodiments, the N-terminal portion comprises at least 200 amino acids, preferably at least 400 amino acids, and even more preferably at least 700 amino acids of SEQ ID NO:2. Exemplary C-terminal truncations of SEQ ID NO:2 include, but are not limited to, SEQ ID NOs: 6, 10, 12, 14, 16, and 20. Likewise, the term “C-terminal truncation of SEQ ID NO:22” refers to a polypeptide comprising a portion of SEQ ID NO:22, wherein the portion comprises the N-terminus of SEQ ID NO:22. In preferred embodiments, the N-terminal portion comprises at least 200 amino acids, preferably at least 400 amino acids, and even more preferably at least 700 amino acids of SEQ ID NO:22. Exemplary C-terminal truncations of SEQ ID NO:22 include, but are not limited to, SEQ ID NOs: 24, 26, 28, 30, 34, 36, 38 and 40.
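The definition above can be read as a prefix test with a minimum length. The sketch below is one possible reading, under a stated simplification: it treats a C-terminal truncation as an exact N-terminal prefix of the wild-type sequence, whereas a frameshift variant may carry a few novel residues before the premature stop. The function name `is_c_terminal_truncation` and the toy sequence are hypothetical and are not taken from any SEQ ID NO.

```python
def is_c_terminal_truncation(candidate: str, wild_type: str, min_length: int = 200) -> bool:
    """True if `candidate` is a shortened N-terminal portion of `wild_type`.

    Assumptions (simplifications of the definition in the text):
      - the retained portion is an exact prefix of the wild-type sequence;
      - `min_length` mirrors the recited 200, 400, or 700 residue minimums.
    """
    return (
        min_length <= len(candidate) < len(wild_type)
        and wild_type.startswith(candidate)
    )


# Toy 300-residue "wild-type" sequence for illustration only.
wild_type = "MKTAYIAKQR" * 30

print(is_c_terminal_truncation(wild_type[:250], wild_type))        # prefix, long enough
print(is_c_terminal_truncation(wild_type[:150], wild_type))        # prefix, but under 200
print(is_c_terminal_truncation(wild_type[:150], wild_type, 100))   # passes a lower minimum
print(is_c_terminal_truncation(wild_type, wild_type))              # full length, not truncated
```

The same check applies to either wild-type sequence named in the definition, since only the reference sequence passed as `wild_type` changes.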

[0056] As used herein, the terms “instructions for using said kit for said detecting the presence or absence of a variant nephroretinin polypeptide in a said biological sample” or “instructions for using said kit for said detecting the presence or absence of a variant inversin polypeptide in a said biological sample” include instructions for using the reagents contained in the kit for the detection of variant and wild-type nephroretinin and inversin polypeptides, respectively. In some embodiments, the instructions further comprise the statement of intended use required by the U.S. Food and Drug Administration (FDA) in labeling in vitro diagnostic products. The FDA classifies in vitro diagnostics as medical devices and requires that they be approved through the 510(k) procedure. Information required in an application under 510(k) includes: 1) The in vitro diagnostic product name, including the trade or proprietary name, the common or usual name, and the classification name of the device; 2) The intended use of the product; 3) The establishment registration number, if applicable, of the owner or operator submitting the 510(k) submission; the class in which the in vitro diagnostic product was placed under section 513 of the FD&C Act, if known, its appropriate panel, or, if the owner or operator determines that the device has not been classified under such section, a statement of that determination and the basis for the determination that the in vitro diagnostic product is not so classified; 4) Proposed labels, labeling and advertisements sufficient to describe the in vitro diagnostic product, its intended use, and directions for use. Where applicable, photographs or engineering drawings should be supplied; 5) A statement indicating that the device is similar to and/or different from other in vitro diagnostic products of comparable type in commercial distribution in the U.S., accompanied by data to support the statement; 6) A 510(k) summary of the safety and effectiveness data upon which the substantial equivalence determination is based; or a statement that the 510(k) safety and effectiveness information supporting the FDA finding of substantial equivalence will be made available to any person within 30 days of a written request; 7) A statement that the submitter believes, to the best of their knowledge, that all data and information submitted in the premarket notification are truthful and accurate and that no material fact has been omitted; 8) Any additional information regarding the in vitro diagnostic product requested that is necessary for the FDA to make a substantial equivalency determination. Additional information is available at the Internet web page of the U.S. FDA.

[0057] The term “gene” refers to a nucleic acid (e.g., DNA) sequence that comprises coding sequences necessary for the production of a polypeptide, RNA (e.g., including but not limited to, mRNA, tRNA and rRNA) or precursor (e.g., NPHP4). The polypeptide, RNA, or precursor can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or functional properties (e.g., enzymatic activity, ligand binding, signal transduction, etc.) of the full-length or fragment are retained. The term also encompasses the coding region of a structural gene and the sequences located adjacent to the coding region on both the 5′ and 3′ ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA. The sequences that are located 5′ of the coding region and which are present on the mRNA are referred to as 5′ untranslated sequences. The sequences that are located 3′ or downstream of the coding region and that are present on the mRNA are referred to as 3′ untranslated sequences. The term “gene” encompasses both cDNA and genomic forms of a gene. A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments of a gene that are transcribed into nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide.

[0058] In particular, the term “NPHP4 gene” refers to the full-length NPHP4 nucleotide sequence (e.g., contained in SEQ ID NO: 1). However, it is also intended that the term encompass fragments of the NPHP4 sequence, mutants (e.g., SEQ ID NOS: 5, 7, 9, 11, 13, 15, 17, 21, 23, and 25) as well as other domains within the full-length NPHP4 nucleotide sequence. Furthermore, the terms “NPHP4 nucleotide sequence” or “NPHP4 polynucleotide sequence” encompass DNA, cDNA, and RNA (e.g., mRNA) sequences.

[0059] Where “amino acid sequence” is recited herein to refer to an amino acid sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms, such as “polypeptide” or “protein” are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule.

[0060] In addition to containing introns, genomic forms of a gene may also include sequences located on both the 5′ and 3′ end of the sequences that are present on the RNA transcript. These sequences are referred to as “flanking” sequences or regions (these flanking sequences are located 5′ or 3′ to the non-translated sequences present on the mRNA transcript). The 5′ flanking region may contain regulatory sequences such as promoters and enhancers that control or influence the transcription of the gene. The 3′ flanking region may contain sequences that direct the termination of transcription, post-transcriptional cleavage and polyadenylation.

[0061] The term “wild-type” refers to a gene or gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the “normal” or “wild-type” form of the gene. In contrast, the terms “modified,” “mutant,” “polymorphism,” and “variant” refer to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.

[0062] As used herein, the terms “nucleic acid molecule encoding,” “DNA sequence encoding,” and “DNA encoding” refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes for the amino acid sequence.

[0063] DNA molecules are said to have “5′ ends” and “3′ ends” because mononucleotides are reacted to make oligonucleotides or polynucleotides in a manner such that the 5′ phosphate of one mononucleotide pentose ring is attached to the 3′ oxygen of its neighbor in one direction via a phosphodiester linkage. Therefore, an end of an oligonucleotide or polynucleotide is referred to as the “5′ end” if its 5′ phosphate is not linked to the 3′ oxygen of a mononucleotide pentose ring and as the “3′ end” if its 3′ oxygen is not linked to a 5′ phosphate of a subsequent mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a larger oligonucleotide or polynucleotide, also may be said to have 5′ and 3′ ends. In either a linear or circular DNA molecule, discrete elements are referred to as being “upstream” or 5′ of the “downstream” or 3′ elements. This terminology reflects the fact that transcription proceeds in a 5′ to 3′ fashion along the DNA strand. The promoter and enhancer elements that direct transcription of a linked gene are generally located 5′ or upstream of the coding region. However, enhancer elements can exert their effect even when located 3′ of the promoter element and the coding region. Transcription termination and polyadenylation signals are located 3′ or downstream of the coding region.

[0064] As used herein, the terms “an oligonucleotide having a nucleotide sequence encoding a gene” and “polynucleotide having a nucleotide sequence encoding a gene” mean a nucleic acid sequence comprising the coding region of a gene or, in other words, the nucleic acid sequence that encodes a gene product. The coding region may be present in a cDNA, genomic DNA, or RNA form. When present in a DNA form, the oligonucleotide or polynucleotide may be single-stranded (i.e., the sense strand) or double-stranded. Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc. may be placed in close proximity to the coding region of the gene if needed to permit proper initiation of transcription and/or correct processing of the primary RNA transcript. Alternatively, the coding region utilized in the expression vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc. or a combination of both endogenous and exogenous control elements.

[0065] As used herein, the term “regulatory element” refers to a genetic element that controls some aspect of the expression of nucleic acid sequences. For example, a promoter is a regulatory element that facilitates the initiation of transcription of an operably linked coding region. Other regulatory elements include splicing signals, polyadenylation signals, termination signals, etc.

[0066] As used herein, the terms “complementary” or “complementarity” are used in reference to polynucleotides (i.e., a sequence of nucleotides) related by the base-pairing rules. For example, the sequence “5′-A-G-T-3′” is complementary to the sequence “3′-T-C-A-5′.” Complementarity may be “partial,” in which only some of the nucleic acids’ bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids.
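
By way of illustration only, the base-pairing rules described above may be sketched in Python as follows. The function names `reverse_complement` and `fraction_paired` are hypothetical and form no part of the invention; only the four standard DNA bases are handled, and both strands are assumed to be written 5′ to 3′.

```python
# Watson-Crick pairing rules for DNA (illustrative; ambiguity codes are not handled).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the fully complementary strand, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def fraction_paired(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that base-pair when two equal-length strands,
    both written 5'->3', are aligned antiparallel.

    1.0 corresponds to "complete" complementarity; values between 0 and 1
    correspond to "partial" complementarity.
    """
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    a, b = strand_a.upper(), strand_b.upper()
    paired = sum(1 for x, y in zip(a, reversed(b)) if PAIRS[x] == y)
    return paired / len(a)


if __name__ == "__main__":
    # 5'-A-G-T-3' pairs with 3'-T-C-A-5', which reads ACT when written 5'->3'.
    print(reverse_complement("AGT"))
    print(fraction_paired("AGT", "ACT"))
```

In this sketch the partner of 5′-A-G-T-3′ is returned in the 5′ to 3′ orientation, so the sequence 3′-T-C-A-5′ of the example above appears as `ACT`.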

[0067] The term “homology” refers to a degree of complementarity. There may be partial homology or complete homology (i.e., identity). A partially complementary sequence is one that at least partially inhibits a completely complementary sequence from hybridizing to a target nucleic acid and is referred to using the functional term “substantially homologous.” The term “inhibition of binding,” when used in reference to nucleic acid binding, refers to inhibition of binding caused by competition of homologous sequences for binding to a target sequence. The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency. A substantially homologous sequence or probe will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous sequence to a target under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction. The absence of non-specific binding may be tested by the use of a second target that lacks even a partial degree of complementarity (e.g., less than about 30% identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target.

[0068] The art knows well that numerous equivalent conditions may be employed to comprise low stringency conditions; factors such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the concentration of the salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol) are considered and the hybridization solution may be varied to generate conditions of low stringency hybridization different from, but equivalent to, the above listed conditions. In addition, the art knows conditions that promote hybridization under conditions of high stringency (e.g., increasing the temperature of the hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.). Furthermore, when used in reference to a double-stranded nucleic acid sequence such as a cDNA or genomic clone, the term “substantially homologous” refers to any probe that can hybridize to either or both strands of the double-stranded nucleic acid sequence under conditions of low stringency as described above.

[0069] A gene may produce multiple RNA species that are generated by differential splicing of the primary RNA transcript. cDNAs that are splice variants of the same gene will contain regions of sequence identity or complete homology (representing the presence of the same exon or portion of the same exon on both cDNAs) and regions of complete non-identity (for example, representing the presence of exon “A” on cDNA 1 wherein cDNA 2 contains exon “B” instead). Because the two cDNAs contain regions of sequence identity they will both hybridize to a probe derived from the entire gene or portions of the gene containing sequences found on both cDNAs; the two splice variants are therefore substantially homologous to such a probe and to each other.

[0070] When used in reference to a single-stranded nucleic acid sequence, the term “substantially homologous” refers to any probe that can hybridize to (i.e., it is the complement of) the single-stranded nucleic acid sequence under conditions of low stringency as described above.

[0071] As used herein, the term “competes for binding” is used in reference to a first polypeptide with an activity which binds to the same substrate as does a second polypeptide with an activity, where the second polypeptide is a variant of the first polypeptide or a related or dissimilar polypeptide. The efficiency (e.g., kinetics or thermodynamics) of binding by the first polypeptide may be the same as or greater than or less than the efficiency of substrate binding by the second polypeptide. For example, the equilibrium binding constant (K_(D)) for binding to the substrate may be different for the two polypeptides. The term “K_(m)” as used herein refers to the Michaelis-Menten constant for an enzyme and is defined as the concentration of the specific substrate at which a given enzyme yields one-half its maximum velocity in an enzyme catalyzed reaction.
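
As a non-limiting illustration of the definition of K_(m) given above, the standard Michaelis-Menten rate equation, v = V_(max)[S]/(K_(m) + [S]), may be sketched as follows. The function and parameter names are hypothetical, and the numerical values are arbitrary examples rather than measured data.

```python
def michaelis_menten_rate(substrate: float, vmax: float, km: float) -> float:
    """Reaction velocity v = Vmax * [S] / (Km + [S]) (standard Michaelis-Menten form)."""
    if substrate < 0 or vmax < 0 or km <= 0:
        raise ValueError("substrate and vmax must be non-negative and km positive")
    return vmax * substrate / (km + substrate)


if __name__ == "__main__":
    # Arbitrary illustrative values: when [S] equals Km the velocity is Vmax / 2.
    print(michaelis_menten_rate(substrate=2.0, vmax=10.0, km=2.0))
```

Setting the substrate concentration equal to K_(m) returns one-half of the maximum velocity, which is the property used to define K_(m) in the preceding paragraph.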

[0072] As used herein, the term “hybridization” is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are impacted by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the T_(m) of the formed hybrid, and the G:C ratio within the nucleic acids.

[0073] As used herein, the term “T_(m)” is used in reference to the “melting temperature.” The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. The equation for calculating the T_(m) of nucleic acids is well known in the art. As indicated by standard references, a simple estimate of the T_(m) value may be calculated by the equation: T_(m) = 81.5 + 0.41(% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (See e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization [1985]). Other references include more sophisticated computations that take structural as well as sequence characteristics into account for the calculation of T_(m).
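
For purposes of illustration only, the simple estimate recited above, T_(m) = 81.5 + 0.41(% G+C), may be computed as in the following sketch. The function names `gc_percent` and `simple_tm` are hypothetical; the estimate assumes a nucleic acid in aqueous solution at 1 M NaCl, as stated above, and does not take probe length or structure into account.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a nucleic acid sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(1 for base in seq if base in "GC") / len(seq)


def simple_tm(seq: str) -> float:
    """Simple estimate Tm = 81.5 + 0.41 * (%G+C), in degrees Celsius,
    for a nucleic acid in aqueous solution at 1 M NaCl."""
    return 81.5 + 0.41 * gc_percent(seq)


if __name__ == "__main__":
    # A hypothetical 20-nucleotide sequence with 50% G+C content.
    print(round(simple_tm("ATGCATGCATGCATGCATGC"), 1))
```

The more sophisticated computations mentioned above would replace `simple_tm` while leaving the G+C calculation unchanged.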

[0074] As used herein the term “stringency” is used in reference to the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. Those skilled in the art will recognize that “stringency” conditions may be altered by varying the parameters just described either individually or in concert. With “high stringency” conditions, nucleic acid base pairing will occur only between nucleic acid fragments that have a high frequency of complementary base sequences (e.g., hybridization under “high stringency” conditions may occur between homologs with about 85-100% identity, preferably about 70-100% identity). With medium stringency conditions, nucleic acid base pairing will occur between nucleic acids with an intermediate frequency of complementary base sequences (e.g., hybridization under “medium stringency” conditions may occur between homologs with about 50-70% identity). Thus, conditions of “weak” or “low” stringency are often required with nucleic acids that are derived from organisms that are genetically diverse, as the frequency of complementary sequences is usually less.

[0075] “High stringency conditions” when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42° C in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH₂PO₄·H₂O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5× Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 0.1×SSPE, 1.0% SDS at 42° C when a probe of about 500 nucleotides in length is employed.

[0076] “Medium stringency conditions” when used in reference to nucleic acid hybridization comprise conditions equivalent to binding or hybridization at 42° C in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH₂PO₄·H₂O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.5% SDS, 5× Denhardt's reagent and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 1.0×SSPE, 1.0% SDS at 42° C when a probe of about 500 nucleotides in length is employed.

[0077] “Low stringency conditions” comprise conditions equivalent to binding or hybridization at 42° C in a solution consisting of 5×SSPE (43.8 g/l NaCl, 6.9 g/l NaH₂PO₄·H₂O and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS, 5× Denhardt's reagent [50× Denhardt's contains per 500 ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)] and 100 μg/ml denatured salmon sperm DNA followed by washing in a solution comprising 5×SSPE, 0.1% SDS at 42° C when a probe of about 500 nucleotides in length is employed. The present invention is not limited to the hybridization of probes of about 500 nucleotides in length. The present invention contemplates the use of probes between approximately 10 nucleotides up to several thousand (e.g., at least 5000) nucleotides in length.

[0078] One skilled in the relevant art understands that stringency conditions may be altered for probes of other sizes (See e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization [1985] and Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY [1989]).

[0079] The following terms are used to describe the sequence relationships between two or more polynucleotides: “reference sequence”, “sequence identity”, “percentage of sequence identity”, and “substantial identity”. A “reference sequence” is a defined sequence used as a basis for a sequence comparison; a reference sequence may be a subset of a larger sequence, for example, as a segment of a full-length cDNA sequence given in a sequence listing or may comprise a complete gene sequence. Generally, a reference sequence is at least 20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50 nucleotides in length. Since two polynucleotides may each (1) comprise a sequence (i.e., a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) may further comprise a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a “comparison window” to identify and compare local regions of sequence similarity. A “comparison window”, as used herein, refers to a conceptual segment of at least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a reference sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window may be conducted by the local homology algorithm of Smith and Waterman [Smith and Waterman, Adv. Appl. Math. 2: 482 (1981)], by the homology alignment algorithm of Needleman and Wunsch [Needleman and Wunsch, J. Mol. Biol. 48:443 (1970)], by the search for similarity method of Pearson and Lipman [Pearson and Lipman, Proc. Natl. Acad. Sci. (U.S.A.) 85:2444 (1988)], by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by inspection, and the best alignment (i.e., resulting in the highest percentage of homology over the comparison window) generated by the various methods is selected. The term “sequence identity” means that two polynucleotide sequences are identical (i.e., on a nucleotide-by-nucleotide basis) over the window of comparison. The term “percentage of sequence identity” is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity. The term “substantial identity” as used herein denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 85 percent sequence identity, preferably at least 90 to 95 percent sequence identity, more usually at least 99 percent sequence identity as compared to a reference sequence over a comparison window of at least 20 nucleotide positions, frequently over a window of at least 25-50 nucleotides, wherein the percentage of sequence identity is calculated by comparing the reference sequence to the polynucleotide sequence which may include deletions or additions which total 20 percent or less of the reference sequence over the window of comparison. The reference sequence may be a subset of a larger sequence, for example, as a segment of the full-length sequences of the compositions claimed in the present invention (e.g., NPHP4).
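
By way of example only, the calculation of “percentage of sequence identity” described above may be sketched as follows for two sequences that have already been optimally aligned. The alignment step itself (e.g., by the Smith and Waterman or Needleman and Wunsch algorithms) is not performed here. The function names, the use of “-” as a gap character, and the default threshold arguments are assumptions made for this illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Matched positions divided by the window size, multiplied by 100.

    Both inputs must already be aligned to the same length; a gap character
    never counts as a match.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    a, b = aligned_a.upper(), aligned_b.upper()
    matched = sum(1 for x, y in zip(a, b) if x == y and x != gap)
    return 100.0 * matched / len(a)


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 85.0,
                               min_window: int = 20) -> bool:
    """True if identity is at least `threshold` percent over a comparison
    window of at least `min_window` positions (85 percent and 20 positions
    being the minimum values recited in the text)."""
    if len(aligned_a) < min_window:
        return False
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "ACGT" * 5                # hypothetical 20-nucleotide window
    candidate = "TCGT" + "ACGT" * 4       # one mismatch in 20 positions
    print(percent_identity(reference, candidate))
    print(is_substantially_identical(reference, candidate))
```

The preferred thresholds of 90 to 95 percent or 99 percent may be applied by passing a different `threshold` value.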

[0080] As applied to polypeptides, the term “substantial identity” means that two peptide sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap weights, share at least 80 percent sequence identity, preferably at least 90 percent sequence identity, more preferably at least 95 percent sequence identity or more (e.g., 99 percent sequence identity). Preferably, residue positions that are not identical differ by conservative amino acid substitutions. Conservative amino acid substitutions refer to the interchangeability of residues having similar side chains. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine. Preferred conservative amino acid substitution groups are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and asparagine-glutamine.
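
The side-chain groups listed above may be represented, solely for illustration, by the following lookup using standard one-letter amino acid codes. The names `SIDE_CHAIN_GROUPS`, `side_chain_group`, and `is_conservative` are hypothetical, and residues not named in the groups above are treated as belonging to no group.

```python
# Side-chain groups as listed in the text, in one-letter amino acid codes.
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),          # glycine, alanine, valine, leucine, isoleucine
    "aliphatic-hydroxyl": set("ST"),    # serine, threonine
    "amide-containing": set("NQ"),      # asparagine, glutamine
    "aromatic": set("FYW"),             # phenylalanine, tyrosine, tryptophan
    "basic": set("KRH"),                # lysine, arginine, histidine
    "sulfur-containing": set("CM"),     # cysteine, methionine
}


def side_chain_group(residue: str):
    """Return the name of the listed group containing the residue, or None."""
    residue = residue.upper()
    for name, members in SIDE_CHAIN_GROUPS.items():
        if residue in members:
            return name
    return None


def is_conservative(residue_a: str, residue_b: str) -> bool:
    """True if two different residues fall within the same listed group."""
    if residue_a.upper() == residue_b.upper():
        return False  # identical residues are not a substitution
    group = side_chain_group(residue_a)
    return group is not None and group == side_chain_group(residue_b)


if __name__ == "__main__":
    print(is_conservative("V", "L"))  # valine to leucine, both aliphatic
    print(is_conservative("K", "D"))  # aspartate is not in any listed group
```

The narrower preferred substitution groups recited above could be encoded in the same manner by replacing the contents of `SIDE_CHAIN_GROUPS`.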

[0081] The term “fragment” as used herein refers to a polypeptide that has an amino-terminal and/or carboxy-terminal deletion as compared to the native protein, but where the remaining amino acid sequence is identical to the corresponding positions in the amino acid sequence deduced from a full-length cDNA sequence. Fragments typically are at least 4 amino acids long, preferably at least 20 amino acids long, usually at least 50 amino acids long or longer, and span the portion of the polypeptide required for intermolecular binding of the compositions (claimed in the present invention) with its various ligands and/or substrates.

[0082] The term “polymorphic locus” is a locus present in a population that shows variation between members of the population (i.e., the most common allele has a frequency of less than 0.95). In contrast, a “monomorphic locus” is a genetic locus at which little or no variation is seen between members of the population (generally taken to be a locus at which the most common allele exceeds a frequency of 0.95 in the gene pool of the population).
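
As an illustrative sketch only, the 0.95 frequency criterion described above may be applied to a table of allele counts as follows. The function names and the example counts are hypothetical. The text does not address a frequency of exactly 0.95; this sketch assumes such a locus is treated as monomorphic.

```python
def most_common_allele_frequency(allele_counts: dict) -> float:
    """Frequency of the most common allele among all observed alleles."""
    total = sum(allele_counts.values())
    if total <= 0:
        raise ValueError("at least one allele observation is required")
    return max(allele_counts.values()) / total


def classify_locus(allele_counts: dict, cutoff: float = 0.95) -> str:
    """Return "polymorphic" if the most common allele has a frequency below
    the cutoff, otherwise "monomorphic" (the boundary case is an assumption)."""
    frequency = most_common_allele_frequency(allele_counts)
    return "polymorphic" if frequency < cutoff else "monomorphic"


if __name__ == "__main__":
    # Hypothetical counts of two alleles observed at one locus.
    print(classify_locus({"A": 90, "G": 10}))
    print(classify_locus({"A": 99, "G": 1}))
```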

[0083] As used herein, the term “genetic variation information” or “genetic variant information” refers to the presence or absence of one or more variant nucleic acid sequences (e.g., polymorphism or mutations) in a given allele of a particular gene (e.g., the NPHP4 gene).

[0084] As used herein, the term “detection assay” refers to an assay for detecting the presence or absence of variant nucleic acid sequences (e.g., polymorphism or mutations) in a given allele of a particular gene (e.g., the NPHP4 gene). Examples of suitable detection assays include, but are not limited to, those described below in Section III B.

[0085] The term “naturally-occurring” as used herein as applied to an object refers to the fact that an object can be found in nature. For example, a polypeptide or polynucleotide sequence that is present in an organism (including viruses) that can be isolated from a source in nature and which has not been intentionally modified by man in the laboratory is naturally-occurring.

[0086] “Amplification” is a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (i.e., replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of “target” specificity. Target sequences are “targets” in the sense that they are sought to be sorted out from other nucleic acid. Amplification techniques have been designed primarily for this sorting out.

[0087] Template specificity is achieved in most amplification techniques by the choice of enzyme. Amplification enzymes are enzymes that, under the conditions in which they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid. For example, in the case of Qβ replicase, MDV-1 RNA is the specific template for the replicase (D. L. Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]). Other nucleic acid will not be replicated by this amplification enzyme. Similarly, in the case of T7 RNA polymerase, this amplification enzyme has a stringent specificity for its own promoters (Chamberlin et al., Nature 228:227 [1970]). In the case of T4 DNA ligase, the enzyme will not ligate the two oligonucleotides or polynucleotides, where there is a mismatch between the oligonucleotide or polynucleotide substrate and the template at the ligation junction (D. Y. Wu and R. B. Wallace, Genomics 4:560 [1989]). Finally, Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are found to display high specificity for the sequences bounded and thus defined by the primers; the high temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences (H. A. Erlich (ed.), PCR Technology, Stockton Press [1989]).

[0088] As used herein, the term “amplifiable nucleic acid” is used in reference to nucleic acids that may be amplified by any amplification method. It is contemplated that “amplifiable nucleic acid” will usually comprise “sample template.”

[0089] As used herein, the term “sample template” refers to nucleic acid originating from a sample that is analyzed for the presence of “target” (defined below). In contrast, “background template” is used in reference to nucleic acid other than sample template that may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.

[0090] As used herein, the term “primer” refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.

[0091] As used herein, the term “probe” refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, that is capable of hybridizing to another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention will be labeled with any “reporter molecule,” so that it is detectable in any detection system, including, but not limited to enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.

[0092] As used herein, the term “target” refers to a nucleic acid sequence or structure to be detected or characterized. Thus, the “target” is sought to be sorted out from other nucleic acid sequences. A “segment” is defined as a region of nucleic acid within the target sequence.

[0093] As used herein, the term “polymerase chain reaction” (“PCR”) refers to the method of K. B. Mullis, U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,965,188, hereby incorporated by reference, that describe a method for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double stranded target sequence. To effect amplification, the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one “cycle”; there can be numerous “cycles”) to obtain a high concentration of an amplified segment of the desired target sequence. The length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the process, the method is referred to as the “polymerase chain reaction” (hereinafter “PCR”). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be “PCR amplified.”
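
The dependence of the amplified segment on the relative positions of the two primers may be illustrated, in a non-limiting manner, by the following sketch, which locates exact primer matches on one strand of a template. The function names, the requirement of exact matches, and the idealized assumption that the number of copies doubles in every cycle are simplifications made for this example; actual amplification efficiency is typically lower.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complementary strand of a DNA sequence, written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplified_segment(template: str, forward_primer: str, reverse_primer: str):
    """Return the segment of `template` (given as the 5'->3' sense strand)
    bounded by the two primers, or None if either primer has no exact match.

    The forward primer matches the given strand; the reverse primer is
    complementary to it, so its binding site appears on the given strand
    as the reverse complement of the primer.
    """
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start < 0:
        return None
    site = reverse_complement(reverse_primer)
    position = template.find(site, start + len(forward_primer))
    if position < 0:
        return None
    return template[start:position + len(site)]


def ideal_copies(initial_copies: int, cycles: int) -> int:
    """Copies after `cycles` rounds, assuming perfect doubling per cycle."""
    return initial_copies * 2 ** cycles


if __name__ == "__main__":
    # Hypothetical template and primers, chosen only for illustration.
    template = "TTTTACGTACGGGGGCCCCCTTGACCAAAA"
    segment = amplified_segment(template, "ACGTAC", "GGTCAA")
    print(segment, len(segment))
    print(ideal_copies(1, 30))
```

Moving either primer site along the template changes the length of the returned segment, which reflects the statement above that this length is a controllable parameter.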

[0094] With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of ³²P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.

[0095] As used herein, the terms “PCR product,” “PCR fragment,” and “amplification product” refer to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.

[0096] As used herein, the term “amplification reagents” refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.) needed for amplification except for primers, nucleic acid template, and the amplification enzyme. Typically, amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).

[0097] As used herein, the terms “restriction endonucleases” and “restriction enzymes” refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence.

[0098] As used herein, the term “recombinant DNA molecule” refers to a DNA molecule that is comprised of segments of DNA joined together by means of molecular biological techniques.

[0099] As used herein, the term “antisense” is used in reference to RNA sequences that are complementary to a specific RNA sequence (e.g., mRNA). Included within this definition are antisense RNA (“asRNA”) molecules involved in gene regulation by bacteria. Antisense RNA may be produced by any method, including synthesis by splicing the gene(s) of interest in a reverse orientation to a viral promoter that permits the synthesis of a coding strand. Once introduced into an embryo, this transcribed strand combines with natural mRNA produced by the embryo to form duplexes. These duplexes then block either the further transcription of the mRNA or its translation. In this manner, mutant phenotypes may be generated. The term “antisense strand” is used in reference to a nucleic acid strand that is complementary to the “sense” strand. The designation (−) (i.e., “negative”) is sometimes used in reference to the antisense strand, with the designation (+) sometimes used in reference to the sense (i.e., “positive”) strand.

[0100] The term “isolated” when used in relation to a nucleic acid, as in “an isolated oligonucleotide” or “isolated polynucleotide” refers to a nucleic acid sequence that is identified and separated from at least one contaminant nucleic acid with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids are nucleic acids such as DNA and RNA found in the state they exist in nature. For example, a given DNA sequence (e.g., a gene) is found on the host cell chromosome in proximity to neighboring genes; RNA sequences, such as a specific mRNA sequence encoding a specific protein, are found in the cell as a mixture with numerous other mRNAs that encode a multitude of proteins. However, isolated nucleic acid encoding NPHP4 includes, by way of example, such nucleic acid in cells ordinarily expressing NPHP4 where the nucleic acid is in a chromosomal location different from that of natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature. The isolated nucleic acid, oligonucleotide, or polynucleotide may be present in single-stranded or double-stranded form. When an isolated nucleic acid, oligonucleotide or polynucleotide is to be utilized to express a protein, the oligonucleotide or polynucleotide will contain at a minimum the sense or coding strand (i.e., the oligonucleotide or polynucleotide may be single-stranded), but may contain both the sense and anti-sense strands (i.e., the oligonucleotide or polynucleotide may be double-stranded).

[0101] As used herein, a “portion of a chromosome” refers to a discrete section of the chromosome. Chromosomes are divided into sites or sections by cytogeneticists as follows: the short (relative to the centromere) arm of a chromosome is termed the “p” arm; the long arm is termed the “q” arm. Each arm is then divided into 2 regions termed region 1 and region 2 (region 1 is closest to the centromere). Each region is further divided into bands. The bands may be further divided into sub-bands. For example, the 11p15.5 portion of human chromosome 11 is the portion located on chromosome 11 (11) on the short arm (p) in the first region (1) in the 5th band (5) in sub-band 5 (.5). A portion of a chromosome may be “altered;” for instance, the entire portion may be absent due to a deletion or may be rearranged (e.g., inversions, translocations, expanded or contracted due to changes in repeat regions). In the case of a deletion, an attempt to hybridize (i.e., specifically bind) a probe homologous to a particular portion of a chromosome could result in a negative result (i.e., the probe could not bind to the sample containing genetic material suspected of containing the missing portion of the chromosome). Thus, hybridization of a probe homologous to a particular portion of a chromosome may be used to detect alterations in a portion of a chromosome.
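
The band nomenclature described above (chromosome, arm, region, band, optional sub-band) can be illustrated with a short parsing sketch. The class name, function name, and regular expression are assumptions made for illustration; the sketch handles only simple labels of the form shown in the 11p15.5 example and is not a complete parser for cytogenetic nomenclature.

```python
import re
from typing import NamedTuple, Optional

class Band(NamedTuple):
    chromosome: str          # e.g. "11", "X"
    arm: str                 # "p" (short arm) or "q" (long arm)
    region: int              # single digit following the arm
    band: int                # single digit following the region
    sub_band: Optional[str]  # digits after the dot, if any

# chromosome, arm, region digit, band digit, optional ".sub-band"
_BAND_RE = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)(?:\.(\d+))?$")

def parse_band(label: str) -> Band:
    """Split a label such as '11p15.5' into its named components."""
    match = _BAND_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized band label: {label!r}")
    chrom, arm, region, band, sub = match.groups()
    return Band(chrom, arm, int(region), int(band), sub)

print(parse_band("11p15.5"))
```

For the example in the text, the sketch yields chromosome 11, arm p, region 1, band 5, and sub-band 5.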

[0102] The term “sequences associated with a chromosome” means preparations of chromosomes (e.g., spreads of metaphase chromosomes), nucleic acid extracted from a sample containing chromosomal DNA (e.g., preparations of genomic DNA); the RNA that is produced by transcription of genes located on a chromosome (e.g., hnRNA and mRNA), and cDNA copies of the RNA transcribed from the DNA located on a chromosome. Sequences associated with a chromosome may be detected by numerous techniques including probing of Southern and Northern blots and in situ hybridization to RNA, DNA, or metaphase chromosomes with probes containing sequences homologous to the nucleic acids in the above-listed preparations.

[0103] As used herein the term “portion” when in reference to a nucleotide sequence (as in “a portion of a given nucleotide sequence”) refers to fragments of that sequence. The fragments may range in size from four nucleotides to the entire nucleotide sequence minus one nucleotide (e.g., 10, 20, 30, 40, 50, 100, 200 nucleotides, etc.).

[0104] As used herein the term “coding region” when used in reference to a structural gene refers to the nucleotide sequences that encode the amino acids found in the nascent polypeptide as a result of translation of an mRNA molecule. The coding region is bounded, in eukaryotes, on the 5′ side by the nucleotide triplet “ATG” that encodes the initiator methionine and on the 3′ side by one of the three triplets that specify stop codons (i.e., TAA, TAG, TGA).
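
As a minimal sketch of the boundaries just described, the following Python function locates the first region of a DNA string that begins with ATG and ends, in the same reading frame, with TAA, TAG, or TGA. The function name and the test sequence are hypothetical; the sketch ignores introns, alternative start sites, and the reverse strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_coding_region(dna: str):
    """Return (start, end) indices of the first ATG...stop region, or None.

    `end` is exclusive, so dna[start:end] includes the stop codon.
    """
    seq = dna.upper()
    start = seq.find("ATG")
    while start != -1:
        # Step through codons in the reading frame set by this ATG.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return start, i + 3
        # No in-frame stop codon; try the next ATG downstream.
        start = seq.find("ATG", start + 1)
    return None

# Hypothetical sequence, for illustration only.
region = find_coding_region("CCATGGCTTGAAA")
print(region)
```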

[0105] As used herein, the term “purified” or “to purify” refers to the removal of contaminants from a sample. For example, NPHP4 antibodies are purified by removal of contaminating non-immunoglobulin proteins; they are also purified by the removal of immunoglobulin that does not bind NPHP4. The removal of non-immunoglobulin proteins and/or the removal of immunoglobulins that do not bind NPHP4 results in an increase in the percent of NPHP4-reactive immunoglobulins in the sample. In another example, recombinant NPHP4 polypeptides are expressed in bacterial host cells and the polypeptides are purified by the removal of host cell proteins; the percent of recombinant NPHP4 polypeptides is thereby increased in the sample.

[0106] The term “recombinant DNA molecule” as used herein refers to a DNA molecule that is comprised of segments of DNA joined together by means of molecular biological techniques.

[0107] The term “recombinant protein” or “recombinant polypeptide” as used herein refers to a protein molecule that is expressed from a recombinant DNA molecule.

[0108] The term “native protein” is used herein to indicate that a protein does not contain amino acid residues encoded by vector sequences; that is, the native protein contains only those amino acids found in the protein as it occurs in nature. A native protein may be produced by recombinant means or may be isolated from a naturally occurring source.

[0109] As used herein the term “portion” when in reference to a protein (as in “a portion of a given protein”) refers to fragments of that protein. The fragments may range in size from four consecutive amino acid residues to the entire amino acid sequence minus one amino acid.

[0110] The term “Southern blot” refers to the analysis of DNA on agarose or acrylamide gels to fractionate the DNA according to size followed by transfer of the DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized DNA is then probed with a labeled probe to detect DNA species complementary to the probe used. The DNA may be cleaved with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be partially depurinated and denatured prior to or during transfer to the solid support. Southern blots are a standard tool of molecular biologists (J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY, pp 9.31-9.58 [1989]).

[0111] The term “Northern blot” as used herein refers to the analysis of RNA by electrophoresis of RNA on agarose gels to fractionate the RNA according to size followed by transfer of the RNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized RNA is then probed with a labeled probe to detect RNA species complementary to the probe used. Northern blots are a standard tool of molecular biologists (J. Sambrook et al., supra, pp 7.39-7.52 [1989]).

[0112] The term “Western blot” refers to the analysis of protein(s) (or polypeptides) immobilized onto a support such as nitrocellulose or a membrane. The proteins are run on acrylamide gels to separate the proteins, followed by transfer of the protein from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized proteins are then exposed to antibodies with reactivity against an antigen of interest. The binding of the antibodies may be detected by various methods, including the use of radiolabeled antibodies.

[0113] The term “antigenic determinant” as used herein refers to that portion of an antigen that makes contact with a particular antibody (i.e., an epitope). When a protein or fragment of a protein is used to immunize a host animal, numerous regions of the protein may induce the production of antibodies that bind specifically to a given region or three-dimensional structure on the protein; these regions or structures are referred to as antigenic determinants. An antigenic determinant may compete with the intact antigen (i.e., the “immunogen” used to elicit the immune response) for binding to an antibody.

[0114] The term “transgene” as used herein refers to a foreign, heterologous, or autologous gene that is placed into an organism by introducing the gene into newly fertilized eggs or early embryos. The term “foreign gene” refers to any nucleic acid (e.g., gene sequence) that is introduced into the genome of an animal by experimental manipulations and may include gene sequences found in that animal so long as the introduced gene does not reside in the same location as does the naturally occurring gene. The term “autologous gene” is intended to encompass variants (e.g., polymorphisms or mutants) of the naturally occurring gene. The term transgene thus encompasses the replacement of the naturally occurring gene with a variant form of the gene.

[0115] As used herein, the term “vector” is used in reference to nucleic acid molecules that transfer DNA segment(s) from one cell to another. The term “vehicle” is sometimes used interchangeably with “vector.”

[0116] The term “expression vector” as used herein refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an operator (optional), and a ribosome binding site, often along with other sequences. Eukaryotic cells are known to utilize promoters, enhancers, and termination and polyadenylation signals.

[0117] As used herein, the term “host cell” refers to any eukaryotic or prokaryotic cell (e.g., bacterial cells such as E. coli, yeast cells, mammalian cells, avian cells, amphibian cells, plant cells, fish cells, and insect cells), whether located in vitro or in vivo. For example, host cells may be located in a transgenic animal.

[0118] The terms “overexpression” and “overexpressing” and grammatical equivalents are used in reference to levels of mRNA to indicate a level of expression approximately 3-fold higher than that typically observed in a given tissue in a control or non-transgenic animal. Levels of mRNA are measured using any of a number of techniques known to those skilled in the art including, but not limited to, Northern blot analysis (See, Example 10, for a protocol for performing Northern blot analysis). Appropriate controls are included on the Northern blot to control for differences in the amount of RNA loaded from each tissue analyzed (e.g., the amount of 28S rRNA, an abundant RNA transcript present at essentially the same amount in all tissues, present in each sample can be used as a means of normalizing or standardizing the NPHP4 mRNA-specific signal observed on Northern blots). The amount of mRNA present in the band corresponding in size to the correctly spliced NPHP4 transgene RNA is quantified; other minor species of RNA which hybridize to the transgene probe are not considered in the quantification of the expression of the transgenic mRNA.
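
The normalization and fold-change comparison described above can be sketched as follows. This is an illustrative calculation only: the function names, the band intensities, and the use of an exact 3.0 cutoff for “approximately 3-fold” are assumptions, not values taken from any example herein.

```python
def normalized_fold_change(transgenic_signal: float,
                           transgenic_28s: float,
                           control_signal: float,
                           control_28s: float) -> float:
    """Ratio of loading-normalized mRNA signal, transgenic over control.

    Each mRNA band intensity is first divided by the 28S rRNA intensity
    from the same lane to correct for differences in RNA loading.
    """
    transgenic = transgenic_signal / transgenic_28s
    control = control_signal / control_28s
    return transgenic / control

def is_overexpressed(fold_change: float, threshold: float = 3.0) -> bool:
    # The 3.0 cutoff is an assumed reading of "approximately 3-fold".
    return fold_change >= threshold

# Hypothetical densitometry values (arbitrary units).
fold = normalized_fold_change(900.0, 150.0, 200.0, 100.0)
print(fold, is_overexpressed(fold))
```

Only the band corresponding to the correctly spliced transcript would be supplied as the mRNA signal; minor hybridizing species are excluded before the calculation.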

[0119] The term “transfection” as used herein refers to the introduction of foreign DNA into eukaryotic cells. Transfection may be accomplished by a variety of means known to the art including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofection, protoplast fusion, retroviral infection, and biolistics.

[0120] The term “stable transfection” or “stably transfected” refers to the introduction and integration of foreign DNA into the genome of the transfected cell. The term “stable transfectant” refers to a cell that has stably integrated foreign DNA into the genomic DNA.

[0121] The term “transient transfection” or “transiently transfected” refers to the introduction of foreign DNA into a cell where the foreign DNA fails to integrate into the genome of the transfected cell. The foreign DNA persists in the nucleus of the transfected cell for several days. During this time the foreign DNA is subject to the regulatory controls that govern the expression of endogenous genes in the chromosomes. The term “transient transfectant” refers to cells that have taken up foreign DNA but have failed to integrate this DNA.

[0122] The term “calcium phosphate co-precipitation” refers to a technique for the introduction of nucleic acids into a cell. The uptake of nucleic acids by cells is enhanced when the nucleic acid is presented as a calcium phosphate-nucleic acid co-precipitate. The original technique of Graham and van der Eb (Graham and van der Eb, Virol., 52:456 [1973]) has been modified by several groups to optimize conditions for particular types of cells. The art is well aware of these numerous modifications.

[0123] A “composition comprising a given polynucleotide sequence” as used herein refers broadly to any composition containing the given polynucleotide sequence. The composition may comprise an aqueous solution. Compositions comprising polynucleotide sequences encoding NPHP4 (e.g., SEQ ID NO: 1) or fragments thereof may be employed as hybridization probes. In this case, the NPHP4 encoding polynucleotide sequences are typically employed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).

[0124] The term “test compound” refers to any chemical entity, pharmaceutical, drug, and the like that can be used to treat or prevent a disease, illness, sickness, or disorder of bodily function, or otherwise alter the physiological or cellular status of a sample. Test compounds comprise both known and potential therapeutic compounds. A test compound can be determined to be therapeutic by screening using the screening methods of the present invention. A “known therapeutic compound” refers to a therapeutic compound that has been shown (e.g., through animal trials or prior experience with administration to humans) to be effective in such treatment or prevention.

[0125] The term “sample” as used herein is used in its broadest sense. A sample suspected of containing a human chromosome or sequences associated with a human chromosome may comprise a cell, chromosomes isolated from a cell (e.g., a spread of metaphase chromosomes), genomic DNA (in solution or bound to a solid support such as for Southern blot analysis), RNA (in solution or bound to a solid support such as for Northern blot analysis), cDNA (in solution or bound to a solid support) and the like. A sample suspected of containing a protein may comprise a cell, a portion of a tissue, an extract containing one or more proteins and the like.

[0126] As used herein, the term “response,” when used in reference to an assay, refers to the generation of a detectable signal (e.g., accumulation of reporter protein, increase in ion concentration, accumulation of a detectable chemical product).

[0127] As used herein, the term “membrane receptor protein” refers to membrane spanning proteins that bind a ligand (e.g., a hormone or neurotransmitter). As is known in the art, protein phosphorylation is a common regulatory mechanism used by cells to selectively modify proteins carrying regulatory signals from outside the cell to the nucleus. The proteins that execute these biochemical modifications are a group of enzymes known as protein kinases. They may further be defined by the substrate residue that they target for phosphorylation. One group of protein kinases is the tyrosine kinases (TKs), which selectively phosphorylate a target protein on its tyrosine residues. Some tyrosine kinases are membrane-bound receptors (RTKs), and, upon activation by a ligand, can autophosphorylate as well as modify substrates. The initiation of sequential phosphorylation by ligand stimulation is a paradigm that underlies the action of such effectors as, for example, epidermal growth factor (EGF), insulin, platelet-derived growth factor (PDGF), and fibroblast growth factor (FGF). The receptors for these ligands are tyrosine kinases and provide the interface between the binding of a ligand (hormone, growth factor) to a target cell and the transmission of a signal into the cell by the activation of one or more biochemical pathways. Ligand binding to a receptor tyrosine kinase activates its intrinsic enzymatic activity. Tyrosine kinases can also be cytoplasmic, non-receptor-type enzymes and act as a downstream component of a signal transduction pathway.

[0128] As used herein, the term “signal transduction protein” refers to proteins that are activated or otherwise affected by ligand binding to a membrane or cytosolic receptor protein or some other stimulus. Examples of signal transduction proteins include adenyl cyclase, phospholipase C, and G-proteins. Many membrane receptor proteins are coupled to G-proteins (i.e., G-protein coupled receptors (GPCRs); for a review, see Neer, Cell 80:249-257 [1995]). Typically, GPCRs contain seven transmembrane domains. Putative GPCRs can be identified on the basis of sequence homology to known GPCRs.

[0129] GPCRs mediate signal transduction across a cell membrane upon the binding of a ligand to an extracellular portion of a GPCR. The intracellular portion of a GPCR interacts with a G-protein to modulate signal transduction from outside to inside a cell. A GPCR is therefore said to be “coupled” to a G-protein. G-proteins are composed of three polypeptide subunits: an α subunit, which binds and hydrolyses GTP, and a dimeric βγ subunit. In the basal, inactive state, the G-protein exists as a heterotrimer of the α and βγ subunits. When the G-protein is inactive, guanosine diphosphate (GDP) is associated with the α subunit of the G-protein. When a GPCR is bound and activated by a ligand, the GPCR binds to the G-protein heterotrimer and decreases the affinity of the Gα subunit for GDP. In its active state, the Gα subunit exchanges GDP for guanosine triphosphate (GTP) and the active Gα subunit disassociates from both the receptor and the dimeric βγ subunit. The disassociated, active Gα subunit transduces signals to effectors that are “downstream” in the G-protein signaling pathway within the cell. Eventually, the G-protein's endogenous GTPase activity returns the active Gα subunit to its inactive state, in which it is associated with GDP and the dimeric βγ subunit.

[0130] Numerous members of the heterotrimeric G-protein family have been cloned, including more than 20 genes encoding various Gα subunits. The various Gα subunits have been categorized into four families, on the basis of amino acid sequences and functional homology. These four families are termed Gα_(s), Gα_(i), Gα_(q), and Gα₁₂. Functionally, these four families differ with respect to the intracellular signaling pathways that they activate and the GPCR to which they couple.

[0131] For example, certain GPCRs normally couple with Gα_(s) and, through Gα_(s), these GPCRs stimulate adenylyl cyclase activity. Other GPCRs normally couple with Gα_(q), and through Gα_(q), these GPCRs can activate phospholipase C (PLC), such as the β isoform of phospholipase C (i.e., PLCβ, Sternweis and Smrcka, Trends in Biochem. Sci. 17:502-506 [1992]).

[0132] As used herein, the term “reporter gene” refers to a gene encoding a protein that may be assayed. Examples of reporter genes include, but are not limited to, luciferase (See, e.g., deWet et al., Mol. Cell. Biol. 7:725 [1987] and U.S. Pat. Nos. 6,074,859; 5,976,796; 5,674,713; and 5,618,682; all of which are incorporated herein by reference), green fluorescent protein (e.g., GenBank Accession Number U43284; a number of GFP variants are commercially available from CLONTECH Laboratories, Palo Alto, Calif.), chloramphenicol acetyltransferase, β-galactosidase, alkaline phosphatase, and horseradish peroxidase.

[0133] As used herein, the terms “computer memory” and “computer memory device” refer to any storage media readable by a computer processor. Examples of computer memory include, but are not limited to, RAM, ROM, computer chips, digital video discs (DVDs), compact discs (CDs), hard disk drives (HDD), and magnetic tape.

[0134] As used herein, the term “computer readable medium” refers to any device or system for storing and providing information (e.g., data and instructions) to a computer processor. Examples of computer readable media include, but are not limited to, DVDs, CDs, hard disk drives, magnetic tape and servers for streaming media over networks.

[0135] As used herein, the term “entering” as in “entering said genetic variation information into said computer” refers to transferring information to a “computer readable medium.” Information may be transferred by any suitable method, including but not limited to, manually (e.g., by typing into a computer) or automated (e.g., transferred from another “computer readable medium” via a “processor”).

[0136] As used herein, the terms “processor” and “central processing unit” or “CPU” are used interchangeably and refer to a device that is able to read a program from a computer memory (e.g., ROM or other computer memory) and perform a set of steps according to the program.

[0137] As used herein, the term “computer implemented method” refers to a method utilizing a “CPU” and “computer readable medium.”

DETAILED DESCRIPTION OF THE INVENTION

[0138] The present invention relates to Nephronophthisis, in particular to the NPHP4 protein (nephroretinin or nephrocystin-4) and nucleic acids encoding the NPHP4 protein. The present invention also provides assays for the detection of NPHP4, and assays for detecting nephroretinin and inversin polymorphisms and mutations associated with disease states.

I. NPHP4 Polynucleotides

[0139] As described above, a new gene associated with NPHP4 kidney disease has been discovered. Accordingly, the present invention provides nucleic acids encoding NPHP4 genes, homologs, variants (e.g., polymorphisms and mutants), including but not limited to, those described in SEQ ID NO: 1. In some embodiments, the present invention provides polynucleotide sequences that are capable of hybridizing to SEQ ID NO: 1 under conditions of low to high stringency as long as the polynucleotide sequence capable of hybridizing encodes a protein that retains a biological activity of the naturally occurring NPHP4. In some embodiments, the protein that retains a biological activity of naturally occurring NPHP4 is 70% homologous to wild-type NPHP4, preferably 80% homologous to wild-type NPHP4, more preferably 90% homologous to wild-type NPHP4, and most preferably 95% homologous to wild-type NPHP4. In preferred embodiments, hybridization conditions are based on the melting temperature (T_(m)) of the nucleic acid binding complex and confer a defined “stringency” as explained above (See e.g., Wahl et al., Meth. Enzymol., 152:399-407 [1987], incorporated herein by reference).
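
The graded homology levels recited above (70%, 80%, 90%, and 95%) can be illustrated with a simple calculation. The sketch below assumes, for illustration only, that homology is measured as percent identity over two sequences that have already been aligned to equal length without gaps; the function names and the ten-residue peptides are hypothetical and are not NPHP4 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of positions that match between two pre-aligned sequences."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)

def homology_tier(pct: float):
    """Return the highest recited cutoff (95, 90, 80, 70) that pct meets."""
    for cutoff in (95, 90, 80, 70):
        if pct >= cutoff:
            return cutoff
    return None  # below the lowest recited level

# Hypothetical peptides differing at one of ten positions.
pct = percent_identity("MKTAYIAKQR", "MKTAYIAKQK")
print(pct, homology_tier(pct))
```

A practical comparison would first require a sequence alignment step, which is outside the scope of this sketch.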

[0140] In other embodiments of the present invention, additional alleles of NPHP4 are provided (e.g., as shown in Example 1). In preferred embodiments, alleles result from a polymorphism or mutation (i.e., a change in the nucleic acid sequence) and generally produce altered mRNAs or polypeptides whose structure or function may or may not be altered. Any given gene may have none, one or many allelic forms. Common mutational changes that give rise to alleles are generally ascribed to deletions, additions or substitutions of nucleic acids. Each of these types of changes may occur alone, or in combination with the others, and at the rate of one or more times in a given sequence. Examples of the alleles of the present invention include those encoded by SEQ ID NO: 1 (wild type) and disease alleles described herein (e.g., SEQ ID NOs: 5, 7, 9, 11, 13, 15, 17, and 19).

[0141] In still other embodiments of the present invention, the nucleotide sequences of the present invention may be engineered in order to alter an NPHP4 coding sequence for a variety of reasons, including but not limited to, alterations which modify the cloning, processing and/or expression of the gene product. For example, mutations may be introduced using techniques that are well known in the art (e.g., site-directed mutagenesis to insert new restriction sites, to alter glycosylation patterns, to change codon preference, etc.).

[0142] In some embodiments of the present invention, the polynucleotide sequence of NPHP4 may be extended utilizing the nucleotide sequence (e.g., SEQ ID NO: 1) in various methods known in the art to detect upstream sequences such as promoters and regulatory elements. For example, it is contemplated that restriction-site polymerase chain reaction (PCR) will find use in the present invention. This is a direct method that uses universal primers to retrieve unknown sequence adjacent to a known locus (Gobinda et al., PCR Methods Applic., 2:318-22 [1993]). First, genomic DNA is amplified in the presence of a primer to a linker sequence and a primer specific to the known region. The amplified sequences are then subjected to a second round of PCR with the same linker primer and another specific primer internal to the first one. Products of each round of PCR are transcribed with an appropriate RNA polymerase and sequenced using reverse transcriptase.

[0143] In another embodiment, inverse PCR can be used to amplify or extend sequences using divergent primers based on a known region (Triglia et al., Nucleic Acids Res., 16:8186 [1988]). The primers may be designed using Oligo 4.0 (National Biosciences Inc., Plymouth, Minn.), or another appropriate program, to be 22-30 nucleotides in length, to have a GC content of 50% or more, and to anneal to the target sequence at temperatures of about 68-72° C. The method uses several restriction enzymes to generate a suitable fragment in the known region of a gene. The fragment is then circularized by intramolecular ligation and used as a PCR template. In still other embodiments, walking PCR is utilized. Walking PCR is a method for targeted gene walking that permits retrieval of unknown sequence (Parker et al., Nucleic Acids Res., 19:3055-60 [1991]). The PROMOTERFINDER kit (Clontech) uses PCR, nested primers and special libraries to “walk in” genomic DNA. This process avoids the need to screen libraries and is useful in finding intron/exon junctions.
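
Two of the primer design criteria recited above, length of 22-30 nucleotides and GC content of 50% or more, lend themselves to a simple check, sketched below. The function names and test primers are hypothetical. The sketch does not model annealing temperature, which depends on salt and primer concentrations and is better estimated with a dedicated primer design program.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)

def meets_primer_criteria(primer: str,
                          min_len: int = 22,
                          max_len: int = 30,
                          min_gc: float = 0.50) -> bool:
    """Check the length and GC-content criteria only."""
    return (min_len <= len(primer) <= max_len
            and gc_fraction(primer) >= min_gc)

# Hypothetical 24-nt candidate, for illustration only.
candidate = "ATGCGTACCGGATCCGTTAGCAAG"
print(len(candidate), round(gc_fraction(candidate), 3),
      meets_primer_criteria(candidate))
```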

[0144] Preferred libraries for screening for full length cDNAs include mammalian libraries that have been size-selected to include larger cDNAs. Also, random primed libraries are preferred, in that they will contain more sequences that contain the 5′ and upstream gene regions. A randomly primed library may be particularly useful in cases where an oligo d(T) library does not yield full-length cDNA. Genomic mammalian libraries are useful for obtaining introns and extending 5′ sequence.

[0145] In other embodiments of the present invention, variants of the disclosed NPHP4 sequences are provided. In preferred embodiments, variants result from polymorphisms or mutations (i.e., a change in the nucleic acid sequence) and generally produce altered mRNAs or polypeptides whose structure or function may or may not be altered. Any given gene may have none, one, or many variant forms. Common mutational changes that give rise to variants are generally ascribed to deletions, additions or substitutions of nucleic acids. Each of these types of changes may occur alone, or in combination with the others, and at the rate of one or more times in a given sequence.

[0146] It is contemplated that it is possible to modify the structure of a peptide having a function (e.g., NPHP4 function) for such purposes as altering the biological activity (e.g., prevention of cystic kidney disease). Such modified peptides are considered functional equivalents of peptides having an activity of NPHP4 as defined herein. A modified peptide can be produced in which the nucleotide sequence encoding the polypeptide has been altered, such as by substitution, deletion, or addition. In particularly preferred embodiments, these modifications do not significantly reduce the biological activity of the modified NPHP4. In other words, construct “X” can be evaluated in order to determine whether it is a member of the genus of modified or variant NPHP4s of the present invention as defined functionally, rather than structurally. In preferred embodiments, the activity of variant NPHP4 polypeptides is evaluated by methods described herein (e.g., the generation of transgenic animals).

[0147] Moreover, as described above, variant forms of NPHP4 are also contemplated as being equivalent to those peptides and DNA molecules that are set forth in more detail herein. For example, it is contemplated that isolated replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar replacement of an amino acid with a structurally related amino acid (i.e., conservative mutations) will not have a major effect on the biological activity of the resulting molecule. Accordingly, some embodiments of the present invention provide variants of NPHP4 disclosed herein containing conservative replacements. Conservative replacements are those that take place within a family of amino acids that are related in their side chains. Genetically encoded amino acids can be divided into four families: (1) acidic (aspartate, glutamate); (2) basic (lysine, arginine, histidine); (3) nonpolar (alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan); and (4) uncharged polar (glycine, asparagine, glutamine, cysteine, serine, threonine, tyrosine). Phenylalanine, tryptophan, and tyrosine are sometimes classified jointly as aromatic amino acids. In similar fashion, the amino acid repertoire can be grouped as (1) acidic (aspartate, glutamate); (2) basic (lysine, arginine, histidine); (3) aliphatic (glycine, alanine, valine, leucine, isoleucine, serine, threonine), with serine and threonine optionally being grouped separately as aliphatic-hydroxyl; (4) aromatic (phenylalanine, tyrosine, tryptophan); (5) amide (asparagine, glutamine); and (6) sulfur-containing (cysteine and methionine) (e.g., Stryer ed., Biochemistry, pg. 17-21, 2nd ed, W H Freeman and Co., 1981). Whether a change in the amino acid sequence of a peptide results in a functional polypeptide can be readily determined by assessing the ability of the variant peptide to function in a fashion similar to the wild-type protein. Peptides having more than one replacement can readily be tested in the same manner.
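
The four-family grouping set out above can be expressed as a lookup table, as in the following sketch, which classifies a single replacement as conservative when both residues fall in the same family. One-letter amino acid codes are used, and the names in the sketch are hypothetical. The sketch addresses side-chain family membership only; it does not predict whether a given variant retains biological activity, which is determined functionally as described above.

```python
# Four side-chain families, using one-letter amino acid codes.
FAMILIES = {
    "acidic": set("DE"),                 # aspartate, glutamate
    "basic": set("KRH"),                 # lysine, arginine, histidine
    "nonpolar": set("AVLIPFMW"),         # alanine ... tryptophan
    "uncharged polar": set("GNQCSTY"),   # glycine ... tyrosine
}

def family_of(residue: str) -> str:
    """Return the family name for a one-letter amino acid code."""
    for name, members in FAMILIES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"not a genetically encoded amino acid: {residue!r}")

def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues belong to the same side-chain family."""
    return family_of(original) == family_of(replacement)

print(is_conservative("L", "I"))  # leucine -> isoleucine
print(is_conservative("G", "W"))  # glycine -> tryptophan
```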

[0148] More rarely, a variant includes “nonconservative” changes (e.g., replacement of a glycine with a tryptophan). Analogous minor variations can also include amino acid deletions or insertions, or both. Guidance in determining which amino acid residues can be substituted, inserted, or deleted without abolishing biological activity can be found using computer programs (e.g., LASERGENE software, DNASTAR Inc., Madison, Wis.).

[0149] As described in more detail below, variants may be produced by methods such as directed evolution or other techniques for producing combinatorial libraries of variants. In still other embodiments of the present invention, the nucleotide sequences of the present invention may be engineered in order to alter an NPHP4 coding sequence including, but not limited to, alterations that modify the cloning, processing, localization, secretion, and/or expression of the gene product. For example, mutations may be introduced using techniques that are well known in the art (e.g., site-directed mutagenesis to insert new restriction sites, alter glycosylation patterns, or change codon preference, etc.).

II. NPHP4 Polypeptides

[0150] In other embodiments, the present invention provides NPHP4 polynucleotide sequences that encode NPHP4 polypeptide sequences. NPHP4 polypeptides (e.g., SEQ ID NOs: 2, 6, 8, 10, 12, 14, 16, 18, and 20) are described in FIGS. 4-13. Other embodiments of the present invention provide fragments, fusion proteins or functional equivalents of these NPHP4 proteins. In some embodiments, the present invention provides truncation mutants of NPHP4 (e.g., SEQ ID NOs: 6, 10, 12, 14, 16, and 20). In still other embodiments of the present invention, nucleic acid sequences corresponding to NPHP4 variants, homologs, and mutants may be used to generate recombinant DNA molecules that direct the expression of the NPHP4 variants, homologs, and mutants in appropriate host cells. In some embodiments of the present invention, the polypeptide may be a naturally purified product, in other embodiments it may be a product of chemical synthetic procedures, and in still other embodiments it may be produced by recombinant techniques using a prokaryotic or eukaryotic host (e.g., by bacterial, yeast, higher plant, insect and mammalian cells in culture). In some embodiments, depending upon the host employed in a recombinant production procedure, the polypeptide of the present invention may be glycosylated or may be non-glycosylated. In other embodiments, the polypeptides of the invention may also include an initial methionine amino acid residue.

[0151] In one embodiment of the present invention, due to the inherent degeneracy of the genetic code, DNA sequences other than the polynucleotide sequences of SEQ ID NO: 1 that encode substantially the same or a functionally equivalent amino acid sequence may be used to clone and express NPHP4. In general, such polynucleotide sequences hybridize to SEQ ID NO: 1 under conditions of high to medium stringency as described above. As will be understood by those of skill in the art, it may be advantageous to produce NPHP4-encoding nucleotide sequences possessing non-naturally occurring codons. Therefore, in some preferred embodiments, codons preferred by a particular prokaryotic or eukaryotic host (Murray et al., Nucl. Acids Res., 17 [1989]) are selected, for example, to increase the rate of NPHP4 expression or to produce recombinant RNA transcripts having desirable properties, such as a longer half-life, than transcripts produced from naturally occurring sequence.
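
The degeneracy of the genetic code referred to above can be illustrated by recoding a short DNA sequence with synonymous codons and confirming that the encoded peptide is unchanged. In the sketch below the standard genetic code is built from its conventional TCAG ordering. The preferred-codon table and the twelve-nucleotide input are hypothetical and do not represent the codon usage of any particular host or any sequence disclosed herein.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def codons(dna: str):
    """Yield successive complete codons of an in-frame DNA string."""
    seq = dna.upper()
    for i in range(0, len(seq) - len(seq) % 3, 3):
        yield seq[i:i + 3]

def translate(dna: str) -> str:
    return "".join(CODON_TABLE[c] for c in codons(dna))

def recode(dna: str, preferred: dict) -> str:
    """Swap each codon for the host-preferred synonym, where one is given."""
    return "".join(preferred.get(CODON_TABLE[c], c) for c in codons(dna))

# Hypothetical host preferences for leucine and alanine only.
PREFERRED = {"L": "CTG", "A": "GCG"}

original = "ATGCTTGCATAA"
recoded = recode(original, PREFERRED)
print(recoded, translate(original) == translate(recoded))
```

The recoded sequence differs from the input at the nucleotide level while encoding the same amino acids, which is the property relied upon when selecting host-preferred codons.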

1. Vectors for Production of NPHP4

[0152] The polynucleotides of the present invention may be employed for producing polypeptides by recombinant techniques. Thus, for example, the polynucleotide may be included in any one of a variety of expression vectors for expressing a polypeptide. In some embodiments of the present invention, vectors include, but are not limited to, chromosomal, nonchromosomal and synthetic DNA sequences (e.g., derivatives of SV40, bacterial plasmids, phage DNA, baculovirus, yeast plasmids, vectors derived from combinations of plasmids and phage DNA, and viral DNA such as vaccinia, adenovirus, fowl pox virus, and pseudorabies). It is contemplated that any vector may be used as long as it is replicable and viable in the host.

[0153] In particular, some embodiments of the present invention provide recombinant constructs comprising one or more of the sequences as broadly described above (e.g., SEQ ID NOs: 1, 5, 7, 9, 11, 13, 15, 17, and 19). In some embodiments of the present invention, the constructs comprise a vector, such as a plasmid or viral vector, into which a sequence of the invention has been inserted, in a forward or reverse orientation. In still other embodiments, the heterologous structural sequence (e.g., SEQ ID NO: 1) is assembled in appropriate phase with translation initiation and termination sequences. In preferred embodiments of the present invention, the appropriate DNA sequence is inserted into the vector using any of a variety of procedures. In general, the DNA sequence is inserted into an appropriate restriction endonuclease site(s) by procedures known in the art.

[0154] Large numbers of suitable vectors are known to those of skill in the art, and are commercially available. Such vectors include, but are not limited to, the following vectors: 1) Bacterial: pQE70, pQE60, pQE-9 (Qiagen), pBS, pD10, phagescript, psiX174, pbluescript SK, pBSKS, pNH8A, pNH16a, pNH18A, pNH46A (Stratagene); ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia); 2) Eukaryotic: pWLNEO, pSV2CAT, pOG44, PXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia); and 3) Baculovirus: pPbac and pMbac (Stratagene). Any other plasmid or vector may be used as long as it is replicable and viable in the host. In some preferred embodiments of the present invention, mammalian expression vectors comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation sites, splice donor and acceptor sites, transcriptional termination sequences, and 5′ flanking non-transcribed sequences. In other embodiments, DNA sequences derived from the SV40 splice and polyadenylation sites may be used to provide the required non-transcribed genetic elements.

[0155] In certain embodiments of the present invention, the DNA sequence in the expression vector is operatively linked to an appropriate expression control sequence(s) (promoter) to direct mRNA synthesis. Promoters useful in the present invention include, but are not limited to, the LTR or SV40 promoter, the E. coli lac or trp, the phage lambda P_(L) and P_(R), T3 and T7 promoters, and the cytomegalovirus (CMV) immediate early, herpes simplex virus (HSV) thymidine kinase, and mouse metallothionein-I promoters and other promoters known to control expression of genes in prokaryotic or eukaryotic cells or their viruses. In other embodiments of the present invention, recombinant expression vectors include origins of replication and selectable markers permitting transformation of the host cell (e.g., dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or tetracycline or ampicillin resistance in E. coli).

[0156] In some embodiments of the present invention, transcription of the DNA encoding the polypeptides of the present invention by higher eukaryotes is increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, that act on a promoter to increase its transcription. Enhancers useful in the present invention include, but are not limited to, the SV40 enhancer on the late side of the replication origin (bp 100 to 270), a cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.

[0157] In other embodiments, the expression vector also contains a ribosome binding site for translation initiation and a transcription terminator. In still other embodiments of the present invention, the vector may also include appropriate sequences for amplifying expression.

2. Host Cells for Production of NPHP4

[0158] In a further embodiment, the present invention provides host cells containing the above-described constructs. In some embodiments of the present invention, the host cell is a higher eukaryotic cell (e.g., a mammalian or insect cell). In other embodiments of the present invention, the host cell is a lower eukaryotic cell (e.g., a yeast cell). In still other embodiments of the present invention, the host cell can be a prokaryotic cell (e.g., a bacterial cell). Specific examples of host cells include, but are not limited to, Escherichia coli, Salmonella typhimurium, Bacillus subtilis, and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus, as well as Saccharomyces cerevisiae, Schizosaccharomyces pombe, Drosophila S2 cells, Spodoptera Sf9 cells, Chinese hamster ovary (CHO) cells, COS-7 lines of monkey kidney fibroblasts (Gluzman, Cell 23:175 [1981]), C127, 3T3, 293, 293T, HeLa and BHK cell lines.

[0159] The constructs in host cells can be used in a conventional manner to produce the gene product encoded by the recombinant sequence. In some embodiments, introduction of the construct into the host cell can be accomplished by calcium phosphate transfection, DEAE-Dextran mediated transfection, or electroporation (See e.g., Davis et al., Basic Methods in Molecular Biology, [1986]). Alternatively, in some embodiments of the present invention, the polypeptides of the invention can be synthetically produced by conventional peptide synthesizers.

[0160] Proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, N.Y., [1989].

[0161] In some embodiments of the present invention, following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter is induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period. In other embodiments of the present invention, cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification. In still other embodiments of the present invention, microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents.

3. Purification of NPHP4

[0162] The present invention also provides methods for recovering and purifying NPHP4 from recombinant cell cultures including, but not limited to, ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and lectin chromatography. In other embodiments of the present invention, protein-refolding steps can be used, as necessary, in completing the configuration of the mature protein. In still other embodiments of the present invention, high performance liquid chromatography (HPLC) can be employed for final purification steps.

[0163] The present invention further provides polynucleotides having the coding sequence (e.g., SEQ ID NO: 1) fused in frame to a marker sequence that allows for purification of the polypeptide of the present invention. A non-limiting example of a marker sequence is a hexahistidine tag which may be supplied by a vector, preferably a pQE-9 vector, which provides for purification of the polypeptide fused to the marker in the case of a bacterial host, or, for example, the marker sequence may be a hemagglutinin (HA) tag when a mammalian host (e.g., COS-7 cells) is used. The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson et al., Cell, 37:767 [1984]).

4. Truncation Mutants of NPHP4

[0164] In addition, the present invention provides fragments of NPHP4 (i.e., truncation mutants, e.g., SEQ ID NOs: 6, 10, 12, 14, 16, and 20). As described above, truncations of NPHP4 were found in families with NPHP type 4 disease. In some embodiments of the present invention, when expression of a portion of the NPHP4 protein is desired, it may be necessary to add a start codon (ATG) to the oligonucleotide fragment containing the desired sequence to be expressed. It is well known in the art that a methionine at the N-terminal position can be enzymatically cleaved by the use of the enzyme methionine aminopeptidase (MAP). MAP has been cloned from E. coli (Ben-Bassat et al., J. Bacteriol., 169:751 [1987]) and Salmonella typhimurium and its in vitro activity has been demonstrated on recombinant proteins (Miller et al., Proc. Natl. Acad. Sci. USA 84:2718 [1990]). Therefore, removal of an N-terminal methionine, if desired, can be achieved either in vivo by expressing such recombinant polypeptides in a host which produces MAP (e.g., E. coli or CM89 or S. cerevisiae), or in vitro by use of purified MAP.

5. Fusion Proteins Containing NPHP4

[0165] The present invention also provides fusion proteins incorporating all or part of NPHP4. Accordingly, in some embodiments of the present invention, the coding sequences for the polypeptide can be incorporated as a part of a fusion gene including a nucleotide sequence encoding a different polypeptide. It is contemplated that this type of expression system will find use under conditions where it is desirable to produce an immunogenic fragment of an NPHP4 protein. In some embodiments of the present invention, the VP6 capsid protein of rotavirus is used as an immunologic carrier protein for portions of the NPHP4 polypeptide, either in the monomeric form or in the form of a viral particle. In other embodiments of the present invention, the nucleic acid sequences corresponding to the portion of NPHP4 against which antibodies are to be raised can be incorporated into a fusion gene construct which includes coding sequences for a late vaccinia virus structural protein to produce a set of recombinant viruses expressing fusion proteins comprising a portion of NPHP4 as part of the virion. It has been demonstrated with the use of immunogenic fusion proteins utilizing the hepatitis B surface antigen that recombinant hepatitis B virions can be utilized in this role as well. Similarly, in other embodiments of the present invention, chimeric constructs coding for fusion proteins containing a portion of NPHP4 and the poliovirus capsid protein are created to enhance immunogenicity of the set of polypeptide antigens (See e.g., EP Publication No. 025949; and Evans et al., Nature 339:385 [1989]; Huang et al., J. Virol., 62:3855 [1988]; and Schlienger et al., J. Virol., 66:2 [1992]).

[0166] In still other embodiments of the present invention, the multiple antigen peptide system for peptide-based immunization can be utilized. In this system, a desired portion of NPHP4 is obtained directly from organo-chemical synthesis of the peptide onto an oligomeric branching lysine core (see e.g., Posnett et al., J. Biol. Chem., 263:1719 [1988]; and Nardelli et al., J. Immunol., 148:914 [1992]). In other embodiments of the present invention, antigenic determinants of the NPHP4 proteins can also be expressed and presented by bacterial cells.

[0167] In addition to utilizing fusion proteins to enhance immunogenicity, it is widely appreciated that fusion proteins can also facilitate the expression of proteins, such as the NPHP4 protein of the present invention. Accordingly, in some embodiments of the present invention, NPHP4 can be generated as a glutathione-S-transferase (GST) fusion protein. It is contemplated that such GST fusion proteins will enable easy purification of NPHP4, such as by the use of glutathione-derivatized matrices (See e.g., Ausubel et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, NY [1991]). In another embodiment of the present invention, a fusion gene coding for a purification leader sequence, such as a poly-(His)/enterokinase cleavage site sequence at the N-terminus of the desired portion of NPHP4, can allow purification of the expressed NPHP4 fusion protein by affinity chromatography using a Ni²⁺ metal resin. In still another embodiment of the present invention, the purification leader sequence can then be removed by treatment with enterokinase (See e.g., Hochuli et al., J. Chromatogr., 411:177 [1987]; and Janknecht et al., Proc. Natl. Acad. Sci. USA 88:8972).

[0168] Techniques for making fusion genes are well known. Essentially, the joining of various DNA fragments coding for different polypeptide sequences is performed in accordance with conventional techniques, employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In another embodiment of the present invention, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, in other embodiments of the present invention, PCR amplification of gene fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed to generate a chimeric gene sequence (See e.g., Current Protocols in Molecular Biology, supra).

6. Variants of NPHP4

[0169] Still other embodiments of the present invention provide mutant or variant forms of NPHP4 (i.e., muteins). It is possible to modify the structure of a peptide having an activity of NPHP4 for such purposes as enhancing therapeutic or prophylactic efficacy, or stability (e.g., ex vivo shelf life, and/or resistance to proteolytic degradation in vivo). Such modified peptides are considered functional equivalents of peptides having an activity of the subject NPHP4 proteins as defined herein. A modified peptide can be produced in which the amino acid sequence has been altered, such as by amino acid substitution, deletion, or addition.

[0170] Moreover, as described above, variant forms (e.g., mutants or polymorphic sequences) of the subject NPHP4 proteins are also contemplated as being equivalent to those peptides and DNA molecules that are set forth in more detail. For example, as described above, the present invention encompasses mutant and variant proteins that contain conservative or non-conservative amino acid substitutions.

[0171] This invention further contemplates a method of generating sets of combinatorial mutants of the present NPHP4 proteins, as well as truncation mutants, and is especially useful for identifying potential variant sequences (i.e., mutants or polymorphic sequences) that are involved in kidney disease or resistance to kidney disease. The purpose of screening such combinatorial libraries is to generate, for example, novel NPHP4 variants that can act as either agonists or antagonists, or alternatively, possess novel activities altogether.

[0172] Therefore, in some embodiments of the present invention, NPHP4 variants are engineered by the present method to provide altered (e.g., increased or decreased) biological activity. In other embodiments of the present invention, combinatorially-derived variants are generated which have a selective potency relative to a naturally occurring NPHP4. Such proteins, when expressed from recombinant DNA constructs, can be used in gene therapy protocols.

[0173] Still other embodiments of the present invention provide NPHP4 variants that have intracellular half-lives dramatically different from that of the corresponding wild-type protein. For example, the altered protein can be rendered either more stable or less stable to proteolytic degradation or other cellular processes that result in destruction of, or otherwise inactivate, NPHP4. Such variants, and the genes which encode them, can be utilized to alter the location of NPHP4 expression by modulating the half-life of the protein. For instance, a short half-life can give rise to more transient NPHP4 biological effects and, when part of an inducible expression system, can allow tighter control of NPHP4 levels within the cell. As above, such proteins, and particularly their recombinant nucleic acid constructs, can be used in gene therapy protocols.

[0174] In still other embodiments of the present invention, NPHP4 variants are generated by the combinatorial approach to act as antagonists, in that they are able to interfere with the ability of the corresponding wild-type protein to regulate cell function.

[0175] In some embodiments of the combinatorial mutagenesis approach of the present invention, the amino acid sequences for a population of NPHP4 homologs, variants or other related proteins are aligned, preferably to promote the highest homology possible. Such a population of variants can include, for example, NPHP4 homologs from one or more species, or NPHP4 variants from the same species but which differ due to mutation or polymorphisms. Amino acids that appear at each position of the aligned sequences are selected to create a degenerate set of combinatorial sequences.
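As a minimal sketch of the selection step just described, and assuming gap-free aligned sequences of equal length, the following Python example collects the amino acids observed at each aligned position and enumerates every combination. The four-residue input sequences are hypothetical placeholders and are not NPHP4 sequences.

```python
from itertools import product


def residues_by_position(aligned):
    """Return, for each aligned column, the sorted set of residues observed."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [sorted(set(column)) for column in zip(*aligned)]


def combinatorial_sequences(aligned):
    """Enumerate every sequence obtainable by choosing one observed residue per position."""
    return ["".join(choice) for choice in product(*residues_by_position(aligned))]


# Hypothetical aligned homologs differing at positions 2 and 3.
homologs = ["MKLA", "MRLA", "MKVA"]
library = combinatorial_sequences(homologs)
# Two variable positions with two residues each give four members,
# including "MRVA", which is not present in the input population.
```

The size of such a set is the product of the number of residues observed at each position, so it grows quickly with the number of variable positions.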

[0176] In a preferred embodiment of the present invention, the combinatorial NPHP4 library is produced by way of a degenerate library of genes encoding a library of polypeptides which each include at least a portion of potential NPHP4 protein sequences. For example, a mixture of synthetic oligonucleotides can be enzymatically ligated into gene sequences such that the degenerate set of potential NPHP4 sequences are expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display) containing the set of NPHP4 sequences therein.

[0177] There are many ways by which the library of potential NPHP4 homologs and variants can be generated from a degenerate oligonucleotide sequence. In some embodiments, chemical synthesis of a degenerate gene sequence is carried out in an automatic DNA synthesizer, and the synthetic genes are ligated into an appropriate gene for expression. The purpose of a degenerate set of genes is to provide, in one mixture, all of the sequences encoding the desired set of potential NPHP4 sequences. The synthesis of degenerate oligonucleotides is well known in the art (See e.g., Narang, Tetrahedron Lett., 39:39 [1983]; Itakura et al., Recombinant DNA, in Walton (ed.), Proceedings of the 3rd Cleveland Symposium on Macromolecules, Elsevier, Amsterdam, pp 273-289 [1981]; Itakura et al., Annu. Rev. Biochem., 53:323 [1984]; Itakura et al., Science 198:1056 [1984]; Ike et al., Nucl. Acid Res., 11:477 [1983]). Such techniques have been employed in the directed evolution of other proteins (See e.g., Scott et al., Science 249:386 [1990]; Roberts et al., Proc. Natl. Acad. Sci. USA 89:2429 [1992]; Devlin et al., Science 249:404 [1990]; Cwirla et al., Proc. Natl. Acad. Sci. USA 87:6378 [1990]; each of which is herein incorporated by reference; as well as U.S. Pat. Nos. 5,223,409, 5,198,346, and 5,096,815; each of which is incorporated herein by reference).

[0178] It is contemplated that the NPHP4 nucleic acids (e.g., SEQ ID NO: 1, and fragments and variants thereof) can be utilized as starting nucleic acids for directed evolution. These techniques can be utilized to develop NPHP4 variants having desirable properties such as increased or decreased biological activity.

[0179] In some embodiments, artificial evolution is performed by random mutagenesis (e.g., by utilizing error-prone PCR to introduce random mutations into a given coding sequence). This method requires that the frequency of mutation be finely tuned. As a general rule, beneficial mutations are rare, while deleterious mutations are common, and the combination of a deleterious mutation and a beneficial mutation often results in an inactive enzyme. The ideal number of base substitutions for a targeted gene is usually between 1.5 and 5 (Moore and Arnold, Nat. Biotech., 14:458 [1996]; Leung et al., Technique, 1:11 [1989]; Eckert and Kunkel, PCR Methods Appl., 1:17-24 [1991]; Caldwell and Joyce, PCR Methods Appl., 2:28 [1992]; and Zhao and Arnold, Nuc. Acids. Res., 25:1307 [1997]). After mutagenesis, the resulting clones are selected for desirable activity (e.g., screened for NPHP4 activity). Successive rounds of mutagenesis and selection are often necessary to develop enzymes with desirable properties. It should be noted that only the useful mutations are carried over to the next round of mutagenesis.
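The tuning of mutation frequency discussed above can be illustrated with simple arithmetic. The following Python sketch assumes, purely for illustration, that substitutions occur independently at a uniform per-base rate, so that the number of substitutions per gene copy is approximately Poisson distributed. The gene length and error rates in the example are hypothetical values and are not parameters of any particular protocol.

```python
import math

# Target window of base substitutions per gene, as stated in the text above.
MIN_SUBSTITUTIONS = 1.5
MAX_SUBSTITUTIONS = 5.0


def expected_substitutions(gene_length_bp, error_rate_per_bp):
    """Mean number of substitutions per gene copy under a uniform error rate."""
    return gene_length_bp * error_rate_per_bp


def in_target_window(gene_length_bp, error_rate_per_bp):
    """True when the expected substitution count falls within the target window."""
    mean = expected_substitutions(gene_length_bp, error_rate_per_bp)
    return MIN_SUBSTITUTIONS <= mean <= MAX_SUBSTITUTIONS


def fraction_unmutated(gene_length_bp, error_rate_per_bp):
    """Poisson estimate of the fraction of clones carrying no substitution."""
    return math.exp(-expected_substitutions(gene_length_bp, error_rate_per_bp))


# Hypothetical 1,000 bp target amplified at a cumulative error rate of 0.2% per base:
# about two substitutions per copy on average, which lies inside the window.
mean = expected_substitutions(1000, 0.002)
```

Under these assumptions, lowering the error rate fourfold for the same target would give an average of 0.5 substitutions per copy, below the window, and most clones would be unmutated.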

[0180] In other embodiments of the present invention, the polynucleotides of the present invention are used in gene shuffling or sexual PCR procedures (e.g., Smith, Nature, 370:324 [1994]; U.S. Pat. Nos. 5,837,458; 5,830,721; 5,811,238; 5,733,731; all of which are herein incorporated by reference). Gene shuffling involves random fragmentation of several mutant DNAs followed by their reassembly by PCR into full length molecules. Examples of various gene shuffling procedures include, but are not limited to, assembly following DNase treatment, the staggered extension process (STEP), and random priming in vitro recombination. In the DNase mediated method, DNA segments isolated from a pool of positive mutants are cleaved into random fragments with DNase I and subjected to multiple rounds of PCR with no added primer. The lengths of random fragments approach that of the uncleaved segment as the PCR cycles proceed, resulting in mutations present in different clones becoming mixed and accumulating in some of the resulting sequences. Multiple cycles of selection and shuffling have led to the functional enhancement of several enzymes (Stemmer, Nature, 370:398 [1994]; Stemmer, Proc. Natl. Acad. Sci. USA, 91:10747 [1994]; Crameri et al., Nat. Biotech., 14:315 [1996]; Zhang et al., Proc. Natl. Acad. Sci. USA, 94:4504 [1997]; and Crameri et al., Nat. Biotech., 15:436 [1997]). Variants produced by directed evolution can be screened for NPHP4 activity by the methods described herein.

[0181] A wide range of techniques are known in the art for screening gene products of combinatorial libraries made by point mutations, and for screening cDNA libraries for gene products having a certain property. Such techniques will be generally adaptable for rapid screening of the gene libraries generated by the combinatorial mutagenesis or recombination of NPHP4 homologs or variants. The most widely used techniques for screening large gene libraries typically comprise cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates relatively easy isolation of the vector encoding the gene whose product was detected.

7. Chemical Synthesis of NPHP4

[0182] In an alternate embodiment of the invention, the coding sequence of NPHP4 is synthesized, whole or in part, using chemical methods well known in the art (See e.g., Caruthers et al., Nucl. Acids Res. Symp. Ser., 7:215 [1980]; Crea and Horn, Nucl. Acids Res., 9:2331 [1980]; Matteucci and Caruthers, Tetrahedron Lett., 21:719 [1980]; and Chow and Kempe, Nucl. Acids Res., 9:2807 [1981]). In other embodiments of the present invention, the protein itself is produced using chemical methods to synthesize either an entire NPHP4 amino acid sequence or a portion thereof. For example, peptides can be synthesized by solid phase techniques, cleaved from the resin, and purified by preparative high performance liquid chromatography (See e.g., Creighton, Proteins Structures And Molecular Principles, W H Freeman and Co, New York N.Y. [1983]). In other embodiments of the present invention, the composition of the synthetic peptides is confirmed by amino acid analysis or sequencing (See e.g., Creighton, supra).

[0183] Direct peptide synthesis can be performed using various solid-phase techniques (Roberge et al., Science 269:202 [1995]) and automated synthesis may be achieved, for example, using an ABI 431A Peptide Synthesizer (Perkin Elmer) in accordance with the instructions provided by the manufacturer. Additionally, the amino acid sequence of NPHP4, or any part thereof, may be altered during direct synthesis and/or combined using chemical methods with other sequences to produce a variant polypeptide.

III. Detection of NPHP4 and Inversin Alleles

[0184] In some embodiments, the present invention provides methods of detecting the presence of wild type or variant (e.g., mutant or polymorphic) NPHP4 nucleic acids or polypeptides and inversin nucleic acids and polypeptides. The detection of mutant NPHP4 polypeptides and inversin polypeptides finds use in the diagnosis of disease (e.g., NPHP type 4 or type 2 disease).

A. NPHP4 and Inversin Alleles

[0185] In some embodiments, the present invention includes alleles of NPHP4 and inversin that increase a patient's susceptibility to NPHP type 4 or type 2 kidney disease (e.g., including, but not limited to, SEQ ID NOs: 5, 7, 9, 11, 13, 15, 17, 19, 23, 25, 27, 29, 33, 35, 37, and 39; also see Example 1 and Example 2). However, the present invention is not limited to the mutations described in SEQ ID NOs: 5, 7, 9, 11, 13, 15, 17, 19, 23, 25, 27, 29, 33, 35, 37, and 39. Any mutation that results in the undesired phenotype (e.g., kidney disease) is within the scope of the present invention.

B. Detection of NPHP4 and Inversin Alleles

[0186] Accordingly, the present invention provides methods for determining whether a patient has an increased susceptibility to NPHP type 4 or type 2 kidney disease by determining whether the individual has a variant NPHP4 allele or inversin allele, respectively. In other embodiments, the present invention provides methods for providing a prognosis of increased risk for kidney disease to an individual based on the presence or absence of one or more variant alleles of NPHP4 or inversin. In preferred embodiments, the variation causes a truncation of the NPHP4 protein or inversin protein.

[0187] A number of methods are available for analysis of variant (e.g., mutant or polymorphic) nucleic acid sequences. Assays for detecting variants (e.g., polymorphisms or mutations) fall into several categories, including, but not limited to, direct sequencing assays, fragment polymorphism assays, hybridization assays, and computer based data analysis. Protocols and commercially available kits or services for performing multiple variations of these assays are available. In some embodiments, assays are performed in combination or in hybrid (e.g., different reagents or technologies from several assays are combined to yield one assay). The following assays are useful in the present invention.

1. Direct Sequencing Assays

[0188] In some embodiments of the present invention, variant sequences are detected using a direct sequencing technique. In these assays, DNA samples are first isolated from a subject using any suitable method. In some embodiments, the region of interest is cloned into a suitable vector and amplified by growth in a host cell (e.g., a bacterium). In other embodiments, DNA in the region of interest is amplified using PCR.

[0189] Following amplification, DNA in the region of interest (e.g., the region containing the SNP or mutation of interest) is sequenced using any suitable method, including but not limited to manual sequencing using radioactive marker nucleotides, or automated sequencing. The results of the sequencing are displayed using any suitable method. The sequence is examined and the presence or absence of a given SNP or mutation is determined.
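The final examination step can be sketched computationally. The following Python example is a minimal illustration that assumes the sequenced region and the reference have already been aligned to the same length without insertions or deletions; it reports each position at which the sample differs from the reference. The short sequences shown are hypothetical and are not portions of any SEQ ID NO.

```python
def find_substitutions(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch."""
    reference, sample = reference.upper(), sample.upper()
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def has_variant(reference, sample, position, variant_base):
    """True when the sample carries the given base at the given 1-based position."""
    return (position, reference.upper()[position - 1], variant_base.upper()) in \
        find_substitutions(reference, sample)


# Hypothetical reference and sample reads differing by a single C-to-T change.
differences = find_substitutions("ATGGCCAAG", "ATGGCTAAG")
```

Insertions and deletions would require an alignment step before comparison, which this sketch deliberately omits.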

2. PCR Assay

[0190] In some embodiments of the present invention, variant sequences are detected using a PCR-based assay. In some embodiments, the PCR assay comprises the use of oligonucleotide primers that hybridize only to the variant or wild type allele of NPHP4 or inversin (e.g., to the region of polymorphism or mutation). Both sets of primers are used to amplify a sample of DNA. If only the mutant primers result in a PCR product, then the patient has the mutant allele of NPHP4 or inversin. If only the wild-type primers result in a PCR product, then the patient has the wild type allele of NPHP4 or inversin.
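The interpretation of the two primer reactions described above can be expressed as a small decision table. The Python sketch below is illustrative only. The first two outcomes follow the text; the remaining two (products from both primer sets, or from neither) are not addressed in the text, and their labels here are assumptions made for completeness.

```python
def interpret_allele_specific_pcr(wild_type_product, variant_product):
    """Map the presence of each allele-specific PCR product to a reported result."""
    if variant_product and not wild_type_product:
        return "variant allele"
    if wild_type_product and not variant_product:
        return "wild-type allele"
    if wild_type_product and variant_product:
        # Assumption: both products suggest that both alleles are present.
        return "both alleles detected"
    # Assumption: no product from either primer set is treated as a failed assay.
    return "no call"


result = interpret_allele_specific_pcr(wild_type_product=False, variant_product=True)
```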

3. Mutational Detection by dHPLC

[0191] In some embodiments of the present invention, variant sequences are detected using a PCR-based assay with consecutive detection of nucleotide variants by dHPLC (denaturing high performance liquid chromatography). Exemplary systems and methods for dHPLC include, but are not limited to, WAVE (Transgenomic, Inc; Omaha, Nebr.) or VARIAN equipment (Palo Alto, Calif.).

4. Fragment Length Polymorphism Assays

[0192] In some embodiments of the present invention, variant sequences are detected using a fragment length polymorphism assay. In a fragment length polymorphism assay, a unique DNA banding pattern based on cleaving the DNA at a series of positions is generated using an enzyme (e.g., a restriction enzyme or a CLEAVASE I [Third Wave Technologies, Madison, Wis.] enzyme). DNA fragments from a sample containing a SNP or a mutation will have a different banding pattern than wild type.

a. RFLP Assay

[0193] In some embodiments of the present invention, variant sequences are detected using a restriction fragment length polymorphism assay (RFLP). The region of interest is first isolated using PCR. The PCR products are then cleaved with restriction enzymes known to give a unique length fragment for a given polymorphism. The restriction-enzyme digested PCR products are separated by agarose gel electrophoresis and visualized by ethidium bromide staining. The length of the fragments is compared to molecular weight markers and fragments generated from wild-type and mutant controls.
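The expected fragment lengths for such a comparison can be estimated in advance. The following Python sketch performs a simple in silico digest of a linear PCR product on one strand. The amplicon sequences are hypothetical, and the EcoRI recognition site (GAATTC, cut after the first base) is used only as a familiar example; it is not suggested that this enzyme distinguishes any allele described herein.

```python
def fragment_lengths(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases from the start of the recognition
    site to the cut position on the strand supplied.
    """
    sequence, site = sequence.upper(), site.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Hypothetical 20 bp amplicons: the variant destroys the recognition site.
wild_type = "AAAAAGAATTCTTTTTTTTT"
variant = "AAAAAGAGTTCTTTTTTTTT"
wild_type_pattern = fragment_lengths(wild_type, "GAATTC", 1)
variant_pattern = fragment_lengths(variant, "GAATTC", 1)
```

In this example the wild-type product is cut into two fragments while the variant product remains uncut, which is the kind of length difference that would be resolved by gel electrophoresis.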

b. CFLP Assay

[0194] In other embodiments, variant sequences are detected using a CLEAVASE fragment length polymorphism assay (CFLP; Third Wave Technologies, Madison, Wis.; See e.g., U.S. Pat. Nos. 5,843,654; 5,843,669; 5,719,208; and 5,888,780; each of which is herein incorporated by reference). This assay is based on the observation that when single strands of DNA fold on themselves, they assume higher order structures that are highly individual to the precise sequence of the DNA molecule. These secondary structures involve partially duplexed regions of DNA such that single stranded regions are juxtaposed with double stranded DNA hairpins. The CLEAVASE I enzyme is a structure-specific, thermostable nuclease that recognizes and cleaves the junctions between these single-stranded and double-stranded regions.

[0195] The region of interest is first isolated, for example, using PCR. Then, DNA strands are separated by heating. Next, the reactions are cooled to allow intrastrand secondary structure to form. The PCR products are then treated with the CLEAVASE I enzyme to generate a series of fragments that are unique to a given SNP or mutation. The CLEAVASE enzyme treated PCR products are separated and detected (e.g., by agarose gel electrophoresis) and visualized (e.g., by ethidium bromide staining). The length of the fragments is compared to molecular weight markers and fragments generated from wild-type and mutant controls.

5. Hybridization Assays

[0196] In preferred embodiments of the present invention, variant sequences are detected using a hybridization assay. In a hybridization assay, the presence or absence of a given SNP or mutation is determined based on the ability of the DNA from the sample to hybridize to a complementary DNA molecule (e.g., an oligonucleotide probe). A variety of hybridization assays using a variety of technologies for hybridization and detection are available. A description of a selection of assays is provided below.

a. Direct Detection of Hybridization

[0197] In some embodiments, hybridization of a probe to the sequence of interest (e.g., a SNP or mutation) is detected directly by visualizing a bound probe (e.g., a Northern or Southern assay; See e.g., Ausubel et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, NY [1991]). In these assays, genomic DNA (Southern) or RNA (Northern) is isolated from a subject. The DNA or RNA is then cleaved with a series of restriction enzymes that cleave infrequently in the genome and not near any of the markers being assayed. The DNA or RNA is then separated (e.g., on an agarose gel) and transferred to a membrane. A labeled (e.g., by incorporating a radionucleotide) probe or probes specific for the SNP or mutation being detected is allowed to contact the membrane under low, medium, or high stringency conditions. Unbound probe is removed and the presence of binding is detected by visualizing the labeled probe.

b. Detection of Hybridization Using “DNA Chip” Assays

[0198] In some embodiments of the present invention, variant sequences are detected using a DNA chip hybridization assay. In this assay, a series of oligonucleotide probes are affixed to a solid support. The oligonucleotide probes are designed to be unique to a given SNP or mutation. The DNA sample of interest is contacted with the DNA “chip” and hybridization is detected.

[0199] In some embodiments, the DNA chip assay is a GeneChip (Affymetrix, Santa Clara, Calif.; See e.g., U.S. Pat. Nos. 6,045,996; 5,925,525; and 5,858,659; each of which is herein incorporated by reference) assay. The GeneChip technology uses miniaturized, high-density arrays of oligonucleotide probes affixed to a “chip.” Probe arrays are manufactured by Affymetrix's light-directed chemical synthesis process, which combines solid-phase chemical synthesis with photolithographic fabrication techniques employed in the semiconductor industry. Using a series of photolithographic masks to define chip exposure sites, followed by specific chemical synthesis steps, the process constructs high-density arrays of oligonucleotides, with each probe in a predefined position in the array. Multiple probe arrays are synthesized simultaneously on a large glass wafer. The wafers are then diced, and individual probe arrays are packaged in injection-molded plastic cartridges, which protect them from the environment and serve as chambers for hybridization.

[0200] The nucleic acid to be analyzed is isolated, amplified by PCR, and labeled with a fluorescent reporter group. The labeled DNA is then incubated with the array using a fluidics station. The array is then inserted into the scanner, where patterns of hybridization are detected. The hybridization data are collected as light emitted from the fluorescent reporter groups already incorporated into the target, which is bound to the probe array. Probes that perfectly match the target generally produce stronger signals than those that have mismatches. Since the sequence and position of each probe on the array are known, by complementarity, the identity of the target nucleic acid applied to the probe array can be determined.
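
Because perfectly matched probes generally give stronger signals than mismatched ones, an allele call can be made by comparing the signal from a wild-type-specific probe with the signal from a variant-specific probe. The Python sketch below illustrates that comparison only. The function name, the signal floor, and the ratio threshold are hypothetical values chosen for illustration; they are not parameters of the GeneChip system or its software.

```python
def call_from_probe_signals(wt_signal, variant_signal, min_signal=100.0, ratio=3.0):
    """Call an allele from the fluorescence of two allele-specific probes.

    min_signal and ratio are hypothetical thresholds: below min_signal neither
    probe is trusted, and one probe must exceed the other by at least `ratio`
    for the sample to be called as carrying a single allele.
    """
    if max(wt_signal, variant_signal) < min_signal:
        return "no call"
    if wt_signal >= ratio * variant_signal:
        return "wild-type"
    if variant_signal >= ratio * wt_signal:
        return "variant"
    # Comparable signal from both probes suggests both alleles are present.
    return "heterozygous"


print(call_from_probe_signals(1200.0, 150.0))
print(call_from_probe_signals(800.0, 700.0))
```

In practice the thresholds would have to be set empirically from known positive and negative control samples.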

[0201] In other embodiments, a DNA microchip containing electronically captured probes (Nanogen, San Diego, Calif.) is utilized (See e.g., U.S. Pat. Nos. 6,017,696; 6,068,818; and 6,051,380; each of which is herein incorporated by reference). Through the use of microelectronics, Nanogen's technology enables the active movement and concentration of charged molecules to and from designated test sites on its semiconductor microchip. DNA capture probes unique to a given SNP or mutation are electronically placed at, or “addressed” to, specific sites on the microchip. Since DNA has a strong negative charge, it can be electronically moved to an area of positive charge.

[0202] First, a test site or a row of test sites on the microchip is electronically activated with a positive charge. Next, a solution containing the DNA probes is introduced onto the microchip. The negatively charged probes rapidly move to the positively charged sites, where they concentrate and are chemically bound to a site on the microchip. The microchip is then washed and another solution of distinct DNA probes is added until the array of specifically bound DNA probes is complete.

[0203] A test sample is then analyzed for the presence of target DNA molecules by determining which of the DNA capture probes hybridize with complementary DNA in the test sample (e.g., a PCR amplified gene of interest). An electronic charge is also used to move and concentrate target molecules to one or more test sites on the microchip. The electronic concentration of sample DNA at each test site promotes rapid hybridization of sample DNA with complementary capture probes (hybridization may occur in minutes). To remove any unbound or nonspecifically bound DNA from each site, the polarity or charge of the site is reversed to negative, thereby forcing any unbound or nonspecifically bound DNA back into solution away from the capture probes. A laser-based fluorescence scanner is used to detect binding.

[0204] In still further embodiments, an array technology based upon the segregation of fluids on a flat surface (chip) by differences in surface tension (ProtoGene, Palo Alto, Calif.) is utilized (See e.g., U.S. Pat. Nos. 6,001,311; 5,985,551; and 5,474,796; each of which is herein incorporated by reference). Protogene's technology is based on the fact that fluids can be segregated on a flat surface by differences in surface tension that have been imparted by chemical coatings. Once so segregated, oligonucleotide probes are synthesized directly on the chip by ink-jet printing of reagents. The array with its reaction sites defined by surface tension is mounted on an X/Y translation stage under a set of four piezoelectric nozzles, one for each of the four standard DNA bases. The translation stage moves along each of the rows of the array and the appropriate reagent is delivered to each of the reaction sites. For example, the A amidite is delivered only to the sites where amidite A is to be coupled during that synthesis step and so on. Common reagents and washes are delivered by flooding the entire surface and then removing them by spinning.

[0205] DNA probes unique for the SNP or mutation of interest are affixed to the chip using Protogene's technology. The chip is then contacted with the PCR-amplified genes of interest. Following hybridization, unbound DNA is removed and hybridization is detected using any suitable method (e.g., by fluorescence de-quenching of an incorporated fluorescent group).

[0206] In yet other embodiments, a “bead array” is used for the detection of polymorphisms (Illumina, San Diego, Calif.; See e.g., PCT Publications WO 99/67641 and WO 00/39587, each of which is herein incorporated by reference). Illumina uses a BEAD ARRAY technology that combines fiber optic bundles and beads that self-assemble into an array. Each fiber optic bundle contains thousands to millions of individual fibers depending on the diameter of the bundle. The beads are coated with an oligonucleotide specific for the detection of a given SNP or mutation. Batches of beads are combined to form a pool specific to the array. To perform an assay, the BEAD ARRAY is contacted with a prepared subject sample (e.g., DNA). Hybridization is detected using any suitable method.

c. Enzymatic Detection of Hybridization

[0207] In some embodiments of the present invention, hybridization is detected by enzymatic cleavage of specific structures (INVADER assay, Third Wave Technologies; See e.g., U.S. Pat. Nos. 5,846,717; 6,090,543; 6,001,567; 5,985,557; and 5,994,069; each of which is herein incorporated by reference). The INVADER assay detects specific DNA and RNA sequences by using structure-specific enzymes to cleave a complex formed by the hybridization of overlapping oligonucleotide probes. Elevated temperature and an excess of one of the probes enable multiple probes to be cleaved for each target sequence present without temperature cycling. These cleaved probes then direct cleavage of a second labeled probe. The secondary probe oligonucleotide can be 5′-end labeled with fluorescein that is quenched by an internal dye. Upon cleavage, the de-quenched fluorescein labeled product may be detected using a standard fluorescence plate reader.

[0208] The INVADER assay detects specific mutations and SNPs in unamplified genomic DNA. The isolated DNA sample is contacted with the first probe specific either for a SNP/mutation or wild type sequence and allowed to hybridize. Then a secondary probe, specific to the first probe, and containing the fluorescein label, is hybridized and the enzyme is added. Binding is detected by using a fluorescent plate reader and comparing the signal of the test sample to known positive and negative controls.

[0209] In some embodiments, hybridization of a bound probe is detected using a TaqMan assay (PE Biosystems, Foster City, Calif.; See e.g., U.S. Pat. Nos. 5,962,233 and 5,538,848, each of which is herein incorporated by reference). The assay is performed during a PCR reaction. The TaqMan assay exploits the 5′-3′ exonuclease activity of the AMPLITAQ GOLD DNA polymerase. A probe, specific for a given allele or mutation, is included in the PCR reaction. The probe consists of an oligonucleotide with a 5′-reporter dye (e.g., a fluorescent dye) and a 3′-quencher dye. During PCR, if the probe is bound to its target, the 5′-3′ nucleolytic activity of the AMPLITAQ GOLD polymerase cleaves the probe between the reporter and the quencher dye. The separation of the reporter dye from the quencher dye results in an increase of fluorescence. The signal accumulates with each cycle of PCR and can be monitored with a fluorimeter.

[0210] In still further embodiments, polymorphisms are detected using the SNP-IT primer extension assay (Orchid Biosciences, Princeton, N.J.; See e.g., U.S. Pat. Nos. 5,952,174 and 5,919,626, each of which is herein incorporated by reference). In this assay, SNPs are identified by using a specially synthesized DNA primer and a DNA polymerase to selectively extend the DNA chain by one base at the suspected SNP location. DNA in the region of interest is amplified and denatured. Polymerase reactions are then performed using miniaturized systems called microfluidics. Detection is accomplished by adding a label to the nucleotide suspected of being at the SNP or mutation location. Incorporation of the label into the DNA can be detected by any suitable method (e.g., if the nucleotide contains a biotin label, detection is via a fluorescently labeled antibody specific for biotin).

6. Mass Spectroscopy Assay

[0211] In some embodiments, a MassARRAY system (Sequenom, San Diego, Calif.) is used to detect variant sequences (See e.g., U.S. Pat. Nos. 6,043,031; 5,777,324; and 5,605,798; each of which is herein incorporated by reference). DNA is isolated from blood samples using standard procedures. Next, specific DNA regions containing the mutation or SNP of interest, about 200 base pairs in length, are amplified by PCR. The amplified fragments are then attached by one strand to a solid surface and the non-immobilized strands are removed by standard denaturation and washing. The remaining immobilized single strand then serves as a template for automated enzymatic reactions that produce genotype specific diagnostic products.

[0212] Very small quantities of the enzymatic products, typically five to ten nanoliters, are then transferred to a SpectroCHIP array for subsequent automated analysis with the SpectroREADER mass spectrometer. Each spot is preloaded with light absorbing crystals that form a matrix with the dispensed diagnostic product. The MassARRAY system uses MALDI-TOF (Matrix Assisted Laser Desorption Ionization-Time of Flight) mass spectrometry. In a process known as desorption, the matrix is hit with a pulse from a laser beam. Energy from the laser beam is transferred to the matrix and it is vaporized, resulting in a small amount of the diagnostic product being expelled into a flight tube. Because the diagnostic product is charged, it is launched down the flight tube towards a detector when an electrical field pulse is subsequently applied to the tube. The time between application of the electrical field pulse and collision of the diagnostic product with the detector is referred to as the time of flight. This is a very precise measure of the product's molecular weight, as a molecule's mass correlates directly with time of flight, with smaller molecules flying faster than larger molecules. The entire assay is completed in less than one thousandth of a second, enabling samples to be analyzed in a total of 3-5 seconds including repetitive data collection. The SpectroTYPER software then calculates, records, compares and reports the genotypes at the rate of three seconds per sample.
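
The relationship between mass and time of flight can be illustrated with the textbook model of an idealized linear time-of-flight instrument, in which an ion of mass m and charge z accelerated through a potential U acquires kinetic energy zeU and then drifts a distance d, so that t = d * sqrt(m / (2zeU)). The Python sketch below applies that model. The accelerating voltage and tube length are assumed values for illustration only and are not specifications of the MassARRAY system.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def time_of_flight(mass_da, charge=1, accel_voltage=20_000.0, tube_length=1.0):
    """Drift time in seconds for an ion in an idealized linear TOF analyzer.

    accel_voltage (volts) and tube_length (meters) are hypothetical values.
    """
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * ELEMENTARY_CHARGE * accel_voltage  # joules
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)         # meters per second
    return tube_length / velocity


# Two hypothetical diagnostic products that differ in mass by one added base.
for mass in (5000.0, 5300.0):
    print(f"{mass:.0f} Da -> {time_of_flight(mass) * 1e6:.2f} microseconds")
```

Under this model the lighter product arrives first, and the flight time scales with the square root of the mass-to-charge ratio, so quadrupling the mass doubles the flight time.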

7. Detection of Variant NPHP4 and Inversin Proteins

[0213] In other embodiments, variant (e.g., truncated) NPHP4 polypeptides and inversin polypeptides are detected (e.g., including, but not limited to, those described in SEQ ID NOs: 6, 8, 10, 12, 14, 16, 18, 20, 24, 26, 28, 30, 34, 36, 38 and 40). Any suitable method may be used to detect truncated or mutant NPHP4 polypeptides including, but not limited to, those described below.

a) Cell Free Translation

[0214] For example, in some embodiments, cell-free translation methods from Ambergen, Inc. (Boston, Mass.) are utilized. Ambergen, Inc. has developed a method for the labeling, detection, quantitation, analysis and isolation of nascent proteins produced in a cell-free or cellular translation system without the use of radioactive amino acids or other radioactive labels. Markers are aminoacylated to tRNA molecules. Potential markers include native amino acids, non-native amino acids, amino acid analogs or derivatives, or chemical moieties. These markers are introduced into nascent proteins from the resulting misaminoacylated tRNAs during the translation process.

[0215] One application of Ambergen's protein labeling technology is the gel free truncation test (GFTT) assay (See e.g., U.S. Pat. No. 6,303,337, herein incorporated by reference). In some embodiments, this assay is used to screen for truncation mutations in a TSC1 or TSC2 protein. In the GFTT assay, a marker (e.g., a fluorophore) is introduced to the nascent protein during translation near the N-terminus of the protein. A second and different marker (e.g., a fluorophore with a different emission wavelength) is introduced to the nascent protein near the C-terminus of the protein. The protein is then separated from the translation system and the signal from the markers is measured. A comparison of the measurements from the N and C terminal signals provides information on the fraction of the molecules with C-terminal truncation (i.e., if the normalized signal from the C-terminal marker is 50% of the signal from the N-terminal marker, 50% of the molecules have a C-terminal truncation).
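
The arithmetic behind the N-terminal and C-terminal comparison is simple enough to show directly. The Python sketch below assumes, as the paragraph above does, that both signals have already been normalized so that a full-length protein gives equal N-terminal and C-terminal readings. The function name and the clamping of the ratio to the range 0 to 1 are illustrative choices, not part of the GFTT assay as described.

```python
def truncated_fraction(n_terminal_signal, c_terminal_signal):
    """Estimate the fraction of molecules that lack the C-terminal marker.

    Both arguments are normalized marker signals. The ratio is clamped to
    [0, 1] so that measurement noise cannot produce a negative fraction.
    """
    if n_terminal_signal <= 0:
        raise ValueError("N-terminal signal must be positive")
    ratio = c_terminal_signal / n_terminal_signal
    ratio = min(max(ratio, 0.0), 1.0)
    return 1.0 - ratio


# C-terminal signal at 50% of the N-terminal signal: half the molecules are truncated.
print(truncated_fraction(1000.0, 500.0))  # 0.5
```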

b) Antibody Binding

[0216] In still further embodiments of the present invention, antibodies (See below for antibody production) are used to determine if an individual contains an allele encoding a variant NPHP4 or inversin gene. In preferred embodiments, antibodies are utilized that discriminate between variant (i.e., truncated) proteins and wild-type proteins (SEQ ID NOs: 2 and 22). In some particularly preferred embodiments, the antibodies are directed to the C-terminus of NPHP4 or inversin. Proteins that are recognized by the N-terminal, but not the C-terminal antibody are truncated. In some embodiments, quantitative immunoassays are used to determine the ratios of C-terminal to N-terminal antibody binding. In other embodiments, identification of variants of NPHP4 or inversin is accomplished through the use of antibodies that differentially bind to wild type or variant forms of NPHP4 or inversin.

[0217] Antibody binding is detected by techniques known in the art (e.g., radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, gel diffusion precipitation reactions, immunodiffusion assays, in situ immunoassays (e.g., using colloidal gold, enzyme or radioisotope labels, for example), Western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays, etc.), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc.).

[0218] In one embodiment, antibody binding is detected by detecting a label on the primary antibody. In another embodiment, the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody. In a further embodiment, the secondary antibody is labeled. Many methods are known in the art for detecting binding in an immunoassay and are within the scope of the present invention.

[0219] In some embodiments, an automated detection assay is utilized. Methods for the automation of immunoassays include those described in U.S. Pat. Nos. 5,885,530, 4,981,785, 6,159,750, and 5,358,691, each of which is herein incorporated by reference. In some embodiments, the analysis and presentation of results is also automated. For example, in some embodiments, software that generates a prognosis based on the result of the immunoassay is utilized.

[0220] In other embodiments, the immunoassay described in U.S. Pat. Nos. 5,599,677 and 5,672,480, each of which is herein incorporated by reference, is utilized.

8. Kits for Analyzing Risk of NPHP Type 4 or Type 2 Disease

[0221] The present invention also provides kits for determining whether an individual contains a wild-type or variant (e.g., mutant or polymorphic) allele of NPHP4, inversin, or NPHP3. In some embodiments, the kits are useful for determining whether the subject is at risk of developing NPHP type 4, type 3 or type 2 disease. The diagnostic kits are produced in a variety of ways. In some embodiments, the kits contain at least one reagent for specifically detecting a mutant NPHP4 allele or protein. In other embodiments, the kits contain at least one reagent for specifically detecting a mutant inversin allele or protein. In still other embodiments, the kits contain at least one reagent for specifically detecting a mutant NPHP3 allele or protein. In preferred embodiments, the kits contain reagents for detecting a truncation in the NPHP4, inversin or NPHP3 gene. In preferred embodiments, the reagent is a nucleic acid that hybridizes to nucleic acids containing the mutation and that does not bind to nucleic acids that do not contain the mutation. In other preferred embodiments, the reagents are primers for amplifying the region of DNA containing the mutation. In still other embodiments, the reagents are antibodies that preferentially bind either the wild-type or truncated NPHP4, inversin or NPHP3 proteins.

[0222] In some embodiments, the kit contains instructions for determining whether the subject is at risk for developing NPHP type 4, type 3 or type 2 disease. In preferred embodiments, the instructions specify that risk for developing NPHP type 4, type 3 or type 2 disease is determined by detecting the presence or absence of a mutant NPHP4, NPHP3 or inversin allele in the subject, wherein subjects having a mutant (e.g., truncated) allele are at greater risk for NPHP disease.

[0223] The presence or absence of a disease-associated mutation in a NPHP4, NPHP3 or inversin gene can be used to make therapeutic or other medical decisions. For example, couples with a family history of NPHP may choose to conceive a child via in vitro fertilization and pre-implantation genetic screening. In this case, fertilized embryos are screened for mutant (e.g., disease associated) alleles of the NPHP4, NPHP3 or inversin gene and only embryos with wild type alleles are implanted in the uterus.

[0224] In other embodiments, in utero screening is performed on a developing fetus (e.g., amniocentesis or chorionic villi screening). In still other embodiments, genetic screening of newborn babies or very young children is performed. The early detection of a NPHP4, NPHP3 or inversin allele known to be associated with kidney disease allows for early intervention (e.g., genetic or pharmaceutical therapies).

[0225] In some embodiments, the kits include ancillary reagents such as buffering agents, nucleic acid stabilizing reagents, protein stabilizing reagents, and signal producing systems (e.g., fluorescence generating systems such as FRET systems). The test kit may be packaged in any suitable manner, typically with the elements in a single container or various containers as necessary along with a sheet of instructions for carrying out the test. In some embodiments, the kits also preferably include a positive control sample.

9. Bioinformatics

[0226] In some embodiments, the present invention provides methods of determining an individual's risk of developing NPHP disease based on the presence of one or more variant alleles of NPHP4, NPHP3 or inversin. In some embodiments, the analysis of variant data is processed by a computer using information stored on a computer (e.g., in a database). For example, in some embodiments, the present invention provides a bioinformatics research system comprising a plurality of computers running a multi-platform object oriented programming language (See e.g., U.S. Pat. No. 6,125,383; herein incorporated by reference). In some embodiments, one of the computers stores genetics data (e.g., the risk of contracting NPHP type 4, type 3 or type 2 disease associated with a given polymorphism, as well as the sequences). In some embodiments, one of the computers stores application programs (e.g., for analyzing the results of detection assays). Results are then delivered to the user (e.g., via one of the computers or via the internet).

[0227] For example, in some embodiments, a computer-based analysis program is used to translate the raw data generated by the detection assay (e.g., the presence, absence, or amount of a given NPHP4 allele or polypeptide) into data of predictive value for a clinician. The clinician can access the predictive data using any suitable means. Thus, in some preferred embodiments, the present invention provides the further benefit that the clinician, who is not likely to be trained in genetics or molecular biology, need not understand the raw data. The data is presented directly to the clinician in its most useful form. The clinician is then able to immediately utilize the information in order to optimize the care of the subject.

[0228] The present invention contemplates any method capable of receiving, processing, and transmitting the information to and from laboratories conducting the assays, information providers, medical personnel, and subjects. For example, in some embodiments of the present invention, a sample (e.g., a biopsy or a serum or urine sample) is obtained from a subject and submitted to a profiling service (e.g., clinical lab at a medical facility, genomic profiling business, etc.), located in any part of the world (e.g., in a country different than the country where the subject resides or where the information is ultimately used) to generate raw data. Where the sample comprises a tissue or other biological sample, the subject may visit a medical center to have the sample obtained and sent to the profiling center, or subjects may collect the sample themselves (e.g., a urine sample) and directly send it to a profiling center. Where the sample comprises previously determined biological information, the information may be directly sent to the profiling service by the subject (e.g., an information card containing the information may be scanned by a computer and the data transmitted to a computer of the profiling center using an electronic communication system). Once received by the profiling service, the sample is processed and a profile is produced (i.e., presence of wild type or mutant NPHP4, NPHP3 or inversin genes or polypeptides), specific for the diagnostic or prognostic information desired for the subject.

[0229] The profile data is then prepared in a format suitable for interpretation by a treating clinician. For example, rather than providing raw data, the prepared format may represent a diagnosis or risk assessment (e.g., likelihood of developing NPHP or a diagnosis of NPHP) for the subject, along with recommendations for particular treatment options. The data may be displayed to the clinician by any suitable method. For example, in some embodiments, the profiling service generates a report that can be printed for the clinician (e.g., at the point of care) or displayed to the clinician on a computer monitor.
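
As a rough illustration of turning a profile into a clinician-facing report, the Python sketch below formats per-gene results as plain text. Everything in it is hypothetical: the function name, the status strings, and the report wording are assumptions made for illustration, and the sketch deliberately stops short of assigning a diagnosis or risk level, which would depend on validated clinical criteria not given here.

```python
GENES = ("NPHP4", "NPHP3", "inversin")


def summarize_profile(profile):
    """Format a profile as a short text report.

    profile maps a gene name to "wild-type" or "variant"; genes missing from
    the mapping are reported as "not tested".
    """
    lines = [f"{gene}: {profile.get(gene, 'not tested')}" for gene in GENES]
    flagged = [gene for gene in GENES if profile.get(gene) == "variant"]
    if flagged:
        lines.append("Variant allele(s) detected in: " + ", ".join(flagged))
    else:
        lines.append("No variant alleles detected in the genes tested")
    return "\n".join(lines)


print(summarize_profile({"NPHP4": "variant", "inversin": "wild-type"}))
```

The same text could be printed at the point of care or shown on a monitor; the formatting step is independent of how the raw data were generated.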

[0230] In some embodiments, the information is first analyzed at the point of care or at a regional facility. The raw data is then sent to a central processing facility for further analysis and/or to convert the raw data to information useful for a clinician or patient. The central processing facility provides the advantage of privacy (all data is stored in a central facility with uniform security protocols), speed, and uniformity of data analysis. The central processing facility can then control the fate of the data following treatment of the subject. For example, using an electronic communication system, the central facility can provide data to the clinician, the subject, or researchers.

[0231] In some embodiments, the subject is able to directly access the data using the electronic communication system. The subject may choose further intervention or counseling based on the results. In some embodiments, the data is used for research use. For example, the data may be used to further optimize the inclusion or elimination of markers as useful indicators of a particular condition or stage of disease.

IV. Generation of NPHP4 and Inversin Antibodies

[0232] The present invention provides isolated antibodies or antibody fragments (e.g., Fab fragments). Antibodies can be generated to allow for the detection of NPHP4 protein. The antibodies may be prepared using various immunogens. In one embodiment, the immunogen is a human NPHP4 peptide to generate antibodies that recognize human NPHP4. Such antibodies include, but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, Fab expression libraries, or recombinant (e.g., chimeric, humanized, etc.) antibodies, as long as they can recognize the protein. Antibodies can be produced by using a protein of the present invention as the antigen according to a conventional antibody or antiserum preparation process.

[0233] Various procedures known in the art may be used for the production of polyclonal antibodies directed against NPHP4. For the production of antibody, various host animals can be immunized by injection with the peptide corresponding to the NPHP4 epitope including but not limited to rabbits, mice, rats, sheep, goats, etc. In a preferred embodiment, the peptide is conjugated to an immunogenic carrier (e.g., diphtheria toxoid, bovine serum albumin (BSA), or keyhole limpet hemocyanin (KLH)). Various adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol), and potentially useful human adjuvants such as BCG (Bacille Calmette-Guerin) and Corynebacterium parvum.

[0234] For preparation of monoclonal antibodies directed toward NPHP4, it is contemplated that any technique that provides for the production of antibody molecules by continuous cell lines in culture will find use with the present invention (See e.g., Harlow and Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). These include but are not limited to the hybridoma technique originally developed by Köhler and Milstein (Köhler and Milstein, Nature 256:495-497 [1975]), as well as the trioma technique, the human B-cell hybridoma technique (See e.g., Kozbor et al., Immunol. Tod., 4:72 [1983]), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96 [1985]).

[0235] In an additional embodiment of the invention, monoclonal antibodies are produced in germ-free animals utilizing technology such as that described in PCT/US90/02545. Furthermore, it is contemplated that human antibodies will be generated by human hybridomas (Cote et al., Proc. Natl. Acad. Sci. USA 80:2026-2030 [1983]) or by transforming human B cells with EBV virus in vitro (Cole et al., in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, pp. 77-96 [1985]).

[0236] In addition, it is contemplated that techniques described for the production of single chain antibodies (U.S. Pat. No. 4,946,778; herein incorporated by reference) will find use in producing NPHP4 specific single chain antibodies. An additional embodiment of the invention utilizes the techniques described for the construction of Fab expression libraries (Huse et al., Science 246:1275-1281 [1989]) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity for NPHP4.

[0237] In other embodiments, the present invention contemplates recombinant antibodies or fragments thereof to the proteins of the present invention. Recombinant antibodies include, but are not limited to, humanized and chimeric antibodies. Methods for generating recombinant antibodies are known in the art (See e.g., U.S. Pat. Nos. 6,180,370 and 6,277,969 and “Monoclonal Antibodies” H. Zola, BIOS Scientific Publishers Limited 2000. Springer-Verlag New York, Inc., New York; each of which is herein incorporated by reference).

[0238] It is contemplated that any technique suitable for producing antibody fragments will find use in generating antibody fragments that contain the idiotype (antigen binding region) of the antibody molecule. For example, such fragments include but are not limited to: F(ab′)2 fragment that can be produced by pepsin digestion of the antibody molecule; Fab′ fragments that can be generated by reducing the disulfide bridges of the F(ab′)2 fragment, and Fab fragments that can be generated by treating the antibody molecule with papain and a reducing agent.

[0239] In the production of antibodies, it is contemplated that screening for the desired antibody will be accomplished by techniques known in the art (e.g., radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, gel diffusion precipitation reactions, immunodiffusion assays, in situ immunoassays (e.g., using colloidal gold, enzyme or radioisotope labels, for example), Western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays, etc.), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc.).

[0240] In one embodiment, antibody binding is detected by detecting a label on the primary antibody. In another embodiment, the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody. In a further embodiment, the secondary antibody is labeled. Many means are known in the art for detecting binding in an immunoassay and are within the scope of the present invention. As is well known in the art, the immunogenic peptide should be provided free of the carrier molecule used in any immunization protocol. For example, if the peptide was conjugated to KLH, it may be conjugated to BSA, or used directly, in a screening assay.

[0241] Additionally, using the above methods, antibodies can be generated that recognize the variant forms of NPHP4 or inversin, while not recognizing the wild type forms of the NPHP4 or inversin proteins.

[0242] The foregoing antibodies can be used in methods known in the art relating to the localization and structure of NPHP4 and inversin (e.g., for Western blotting, immunoprecipitation and immunocytochemistry, see Examples 3-6), measuring levels thereof in appropriate biological samples, etc. The antibodies can be used to detect NPHP4 or inversin in a biological sample from an individual. The biological sample can be a biological fluid, such as, but not limited to, blood, serum, plasma, interstitial fluid, urine, cerebrospinal fluid, and the like, containing cells.

[0243] The biological samples can then be tested directly for the presence of human NPHP4 using an appropriate strategy (e.g., ELISA or radioimmunoassay) and format (e.g., microwells, dipstick (e.g., as described in International Patent Publication WO 93/03367), etc.). Alternatively, proteins in the sample can be size separated (e.g., by polyacrylamide gel electrophoresis (PAGE), in the presence or absence of sodium dodecyl sulfate (SDS)), and the presence of NPHP4 detected by immunoblotting (Western blotting). Immunoblotting techniques are generally more effective with antibodies generated against a peptide corresponding to an epitope of a protein, and hence, are particularly suited to the present invention.

[0244] Another method uses antibodies as agents to alter signal transduction. Specific antibodies that bind to the binding domains of NPHP4 or inversin or other proteins involved in intracellular signaling can be used to inhibit the interaction between the various proteins and their interaction with other ligands. Antibodies that bind to the complex can also be used therapeutically to inhibit interactions of the protein complex in the signal transduction pathways leading to the various physiological and cellular effects of NPHP. Such antibodies can also be used diagnostically to measure abnormal expression of NPHP4 or inversin, or the aberrant formation of protein complexes, which may be indicative of a disease state.

V. Gene Therapy using NPHP4 and Inversin

[0245] The present invention also provides methods and compositions suitable for gene therapy to alter NPHP4 or inversin expression, production, or function. As described above, the present invention provides human NPHP4 genes and provides methods of obtaining NPHP4 genes from other species. Thus, the methods described below are generally applicable across many species. In some embodiments, it is contemplated that the gene therapy is performed by providing a subject with a wild-type allele of NPHP4 or inversin (i.e., an allele that does not contain an NPHP disease-causing polymorphism or mutation; See Example 6). Subjects in need of such therapy are identified by the methods described above.

[0246] Viral vectors commonly used for in vivo or ex vivo targeting and therapy procedures are DNA-based vectors and retroviral vectors. Methods for constructing and using viral vectors are known in the art (See e.g., Miller and Rosman, BioTech., 7:980-990 [1992]). Preferably, the viral vectors are replication defective, that is, they are unable to replicate autonomously in the target cell. In general, the genome of the replication defective viral vectors that are used within the scope of the present invention lacks at least one region that is necessary for the replication of the virus in the infected cell. These regions can either be eliminated (in whole or in part), or be rendered non-functional by any technique known to a person skilled in the art. These techniques include the total removal, substitution (by other sequences, in particular by the inserted nucleic acid), partial deletion or addition of one or more bases to an essential (for replication) region. Such techniques may be performed in vitro (i.e., on the isolated DNA) or in situ, using the techniques of genetic manipulation or by treatment with mutagenic agents.

[0247] Preferably, the replication defective virus retains the sequences of its genome that are necessary for encapsidating the viral particles. DNA viral vectors include attenuated or defective DNA viruses, including, but not limited to, herpes simplex virus (HSV), papillomavirus, Epstein Barr virus (EBV), adenovirus, adeno-associated virus (AAV), and the like. Defective viruses that entirely or almost entirely lack viral genes are preferred, as a defective virus is not infective after introduction into a cell. Use of defective viral vectors allows for administration to cells in a specific, localized area, without concern that the vector can infect other cells. Thus, a specific tissue can be specifically targeted. Examples of particular vectors include, but are not limited to, a defective herpes virus 1 (HSV1) vector (Kaplitt et al., Mol. Cell. Neurosci., 2:320-330 [1991]), defective herpes virus vector lacking a glycoprotein L gene (See e.g., Patent Publication RD 371005 A), or other defective herpes virus vectors (See e.g., WO 94/21807; and WO 92/05263); an attenuated adenovirus vector, such as the vector described by Stratford-Perricaudet et al. (J. Clin. Invest., 90:626-630 [1992]; See also, La Salle et al., Science 259:988-990 [1993]); and a defective adeno-associated virus vector (Samulski et al., J. Virol., 61:3096-3101 [1987]; Samulski et al., J. Virol., 63:3822-3828 [1989]; and Lebkowski et al., Mol. Cell. Biol., 8:3988-3996 [1988]).

[0248] Preferably, for in vivo administration, an appropriate immunosuppressive treatment is employed in conjunction with the viral vector (e.g., adenovirus vector), to avoid immuno-deactivation of the viral vector and transfected cells. For example, immunosuppressive cytokines, such as interleukin-12 (IL-12), interferon-gamma (IFN-γ), or anti-CD4 antibody, can be administered to block humoral or cellular immune responses to the viral vectors. In addition, it is advantageous to employ a viral vector that is engineered to express a minimal number of antigens.

[0249] In a preferred embodiment, the vector is an adenovirus vector. Adenoviruses are eukaryotic DNA viruses that can be modified to efficiently deliver a nucleic acid of the invention to a variety of cell types. Various serotypes of adenovirus exist. Of these serotypes, preference is given, within the scope of the present invention, to type 2 or type 5 human adenoviruses (Ad 2 or Ad 5), or adenoviruses of animal origin (See e.g., WO 94/26914). Those adenoviruses of animal origin that can be used within the scope of the present invention include adenoviruses of canine, bovine, murine (e.g., Mav1, Beard et al., Virol., 75-81 [1990]), ovine, porcine, avian, and simian (e.g., SAV) origin. Preferably, the adenovirus of animal origin is a canine adenovirus, more preferably a CAV2 adenovirus (e.g., Manhattan or A26/61 strain (ATCC VR-800)).

[0250] Preferably, the replication defective adenoviral vectors of the invention comprise the ITRs, an encapsidation sequence and the nucleic acid of interest. Still more preferably, at least the E1 region of the adenoviral vector is non-functional. The deletion in the E1 region preferably extends from nucleotides 455 to 3329 in the sequence of the Ad5 adenovirus (PvuII-BglII fragment) or 382 to 3446 (HinfII-Sau3A fragment). Other regions may also be modified, in particular the E3 region (e.g., WO 95/02697), the E2 region (e.g., WO 94/28938), the E4 region (e.g., WO 94/28152, WO 94/12649 and WO 95/02697), or any of the late genes L1-L5.

[0251] In a preferred embodiment, the adenoviral vector has a deletion in the E1 region (Ad 1.0). Examples of E1-deleted adenoviruses are disclosed in EP 185,573, the contents of which are incorporated herein by reference. In another preferred embodiment, the adenoviral vector has a deletion in the E1 and E4 regions (Ad 3.0). Examples of E1/E4-deleted adenoviruses are disclosed in WO 95/02697 and WO 96/22378. In still another preferred embodiment, the adenoviral vector has a deletion in the E1 region into which the E4 region and the nucleic acid sequence are inserted.

[0252] The replication defective recombinant adenoviruses according to the invention can be prepared by any technique known to the person skilled in the art (See e.g., Levrero et al., Gene 101:195 [1991]; EP 185 573; and Graham, EMBO J., 3:2917 [1984]). In particular, they can be prepared by homologous recombination between an adenovirus and a plasmid that carries, inter alia, the DNA sequence of interest. The homologous recombination is accomplished following co-transfection of the adenovirus and plasmid into an appropriate cell line. The cell line that is employed should preferably (i) be transformable by the elements to be used, and (ii) contain the sequences that are able to complement the part of the genome of the replication defective adenovirus, preferably in integrated form in order to avoid the risks of recombination. Examples of cell lines that may be used are the human embryonic kidney cell line 293 (Graham et al., J. Gen. Virol., 36:59 [1977]), which contains the left-hand portion of the genome of an Ad5 adenovirus (12%) integrated into its genome, and cell lines that are able to complement the E1 and E4 functions, as described in applications WO 94/26914 and WO 95/02697. Recombinant adenoviruses are recovered and purified using standard molecular biological techniques that are well known to one of ordinary skill in the art.

[0253] The adeno-associated viruses (AAV) are DNA viruses of relatively small size that can integrate, in a stable and site-specific manner, into the genome of the cells that they infect. They are able to infect a wide spectrum of cells without inducing any effects on cellular growth, morphology or differentiation, and they do not appear to be involved in human pathologies. The AAV genome has been cloned, sequenced and characterized. It encompasses approximately 4700 bases and contains an inverted terminal repeat (ITR) region of approximately 145 bases at each end, which serves as an origin of replication for the virus. The remainder of the genome is divided into two essential regions that carry the encapsidation functions: the left-hand part of the genome, which contains the rep gene involved in viral replication and expression of the viral genes; and the right-hand part of the genome, which contains the cap gene encoding the capsid proteins of the virus.

[0254] The use of vectors derived from the AAVs for transferring genes in vitro and in vivo has been described (See e.g., WO 91/18088; WO 93/09239; U.S. Pat. No. 4,797,368; U.S. Pat. No. 5,139,941; and EP 488 528, all of which are herein incorporated by reference). These publications describe various AAV-derived constructs in which the rep and/or cap genes are deleted and replaced by a gene of interest, and the use of these constructs for transferring the gene of interest in vitro (into cultured cells) or in vivo (directly into an organism). The replication defective recombinant AAVs according to the invention can be prepared by co-transfecting a plasmid containing the nucleic acid sequence of interest flanked by two AAV inverted terminal repeat (ITR) regions, and a plasmid carrying the AAV encapsidation genes (rep and cap genes), into a cell line that is infected with a human helper virus (for example an adenovirus). The AAV recombinants that are produced are then purified by standard techniques.

[0255] In another embodiment, the gene can be introduced in a retroviral vector (e.g., as described in U.S. Pat. Nos. 5,399,346, 4,650,764, 4,980,289 and 5,124,263; all of which are herein incorporated by reference; Mann et al., Cell 33:153 [1983]; Markowitz et al., J. Virol., 62:1120 [1988]; PCT/US95/14575; EP 453242; EP 178220; Bernstein et al., Genet. Eng., 7:235 [1985]; McCormick, BioTechnol., 3:689 [1985]; WO 95/07358; and Kuo et al., Blood 82:845 [1993]). The retroviruses are integrating viruses that infect dividing cells. The retrovirus genome includes two LTRs, an encapsidation sequence and three coding regions (gag, pol and env). In recombinant retroviral vectors, the gag, pol and env genes are generally deleted, in whole or in part, and replaced with a heterologous nucleic acid sequence of interest. These vectors can be constructed from different types of retrovirus, such as HIV, MoMuLV (“murine Moloney leukemia virus”), MSV (“murine Moloney sarcoma virus”), HaSV (“Harvey sarcoma virus”), SNV (“spleen necrosis virus”), RSV (“Rous sarcoma virus”) and Friend virus. Defective retroviral vectors are also disclosed in WO 95/02697.

[0256] In general, in order to construct recombinant retroviruses containing a nucleic acid sequence, a plasmid is constructed that contains the LTRs, the encapsidation sequence and the coding sequence. This construct is used to transfect a packaging cell line, which cell line is able to supply in trans the retroviral functions that are deficient in the plasmid. In general, the packaging cell lines are thus able to express the gag, pol and env genes. Such packaging cell lines have been described in the prior art, in particular the cell line PA317 (U.S. Pat. No. 4,861,719, herein incorporated by reference), the PsiCRIP cell line (See, WO 90/02806), and the GP+envAm-12 cell line (See, WO 89/07150). In addition, the recombinant retroviral vectors can contain modifications within the LTRs for suppressing transcriptional activity as well as extensive encapsidation sequences that may include a part of the gag gene (Bender et al., J. Virol., 61:1639 [1987]). Recombinant retroviral vectors are purified by standard techniques known to those having ordinary skill in the art.

[0257] Alternatively, the vector can be introduced in vivo by lipofection. For the past decade, there has been increasing use of liposomes for encapsulation and transfection of nucleic acids in vitro. Synthetic cationic lipids designed to limit the difficulties and dangers encountered with liposome mediated transfection can be used to prepare liposomes for in vivo transfection of a gene encoding a marker (Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 [1987]; See also, Mackey et al., Proc. Natl. Acad. Sci. USA 85:8027-8031 [1988]; Ulmer et al., Science 259:1745-1748 [1993]). The use of cationic lipids may promote encapsulation of negatively charged nucleic acids, and also promote fusion with negatively charged cell membranes (Felgner and Ringold, Science 337:387-388 [1989]). Particularly useful lipid compounds and compositions for transfer of nucleic acids are described in WO 95/18863 and WO 96/17823, and in U.S. Pat. No. 5,459,127, herein incorporated by reference.

[0258] Other molecules are also useful for facilitating transfection of a nucleic acid in vivo, such as a cationic oligopeptide (e.g., WO 95/21931), peptides derived from DNA binding proteins (e.g., WO 96/25508), or a cationic polymer (e.g., WO 95/21931).

[0259] It is also possible to introduce the vector in vivo as a naked DNA plasmid. Methods for formulating and administering naked DNA to mammalian muscle tissue are disclosed in U.S. Pat. Nos. 5,580,859 and 5,589,466, both of which are herein incorporated by reference.

[0260] DNA vectors for gene therapy can be introduced into the desired host cells by methods known in the art, including but not limited to transfection, electroporation, microinjection, transduction, cell fusion, DEAE dextran, calcium phosphate precipitation, use of a gene gun, or use of a DNA vector transporter (See e.g., Wu et al., J. Biol. Chem., 267:963 [1992]; Wu and Wu, J. Biol. Chem., 263:14621 [1988]; and Williams et al., Proc. Natl. Acad. Sci. USA 88:2726 [1991]). Receptor-mediated DNA delivery approaches can also be used (Curiel et al., Hum. Gene Ther., 3:147 [1992]; and Wu and Wu, J. Biol. Chem., 262:4429 [1987]).

VI. Transgenic Animals Expressing Exogenous NPHP4 Genes and Homologs, Mutants, and Variants Thereof

[0261] The present invention contemplates the generation of transgenic animals comprising an exogenous NPHP4 gene or inversin gene or homologs, mutants, or variants thereof. In preferred embodiments, the transgenic animal displays an altered phenotype as compared to wild-type animals. In some embodiments, the altered phenotype is the overexpression of mRNA for a NPHP4 gene or inversin gene as compared to wild-type levels of NPHP4 or inversin expression. In other embodiments, the altered phenotype is the decreased expression of mRNA for an endogenous NPHP4 gene or inversin gene as compared to wild-type levels of endogenous NPHP4 or inversin expression. In some preferred embodiments, the transgenic animals comprise mutant (e.g., truncated) alleles of NPHP4 or inversin. Methods for analyzing the presence or absence of such phenotypes include Northern blotting, mRNA protection assays, and RT-PCR. In other embodiments, the transgenic mice have a knock-out mutation of the NPHP4 gene or inversin gene. In preferred embodiments, the transgenic animals display a NPHP disease phenotype.

[0262] Such animals find use in research applications (e.g., identifying signaling pathways involved in NPHP), as well as drug screening applications (e.g., to screen for drugs that prevent NPHP disease). For example, in some embodiments, test compounds (e.g., a drug that is suspected of being useful to treat NPHP disease) and control compounds (e.g., a placebo) are administered to the transgenic animals and the control animals and the effects evaluated. The effects of the test and control compounds on disease symptoms are then assessed.

[0263] The transgenic animals can be generated via a variety of methods. In some embodiments, embryonal cells at various developmental stages are used to introduce transgenes for the production of transgenic animals. Different methods are used depending on the stage of development of the embryonal cell. The zygote is the best target for micro-injection. In the mouse, the male pronucleus reaches the size of approximately 20 micrometers in diameter, which allows reproducible injection of 1-2 picoliters (pl) of DNA solution. The use of zygotes as a target for gene transfer has a major advantage in that in most cases the injected DNA will be incorporated into the host genome before the first cleavage (Brinster et al., Proc. Natl. Acad. Sci. USA 82:4438-4442 [1985]). As a consequence, all cells of the transgenic non-human animal will carry the incorporated transgene. This will in general also be reflected in the efficient transmission of the transgene to offspring of the founder since 50% of the germ cells will harbor the transgene. U.S. Pat. No. 4,873,191 describes a method for the micro-injection of zygotes; the disclosure of this patent is incorporated herein in its entirety.

[0264] In other embodiments, retroviral infection is used to introduce transgenes into a non-human animal. In some embodiments, the retroviral vector is utilized to transfect oocytes by injecting the retroviral vector into the perivitelline space of the oocyte (U.S. Pat. No. 6,080,912, incorporated herein by reference). In other embodiments, the developing non-human embryo can be cultured in vitro to the blastocyst stage. During this time, the blastomeres can be targets for retroviral infection (Jaenisch, Proc. Natl. Acad. Sci. USA 73:1260 [1976]). Efficient infection of the blastomeres is obtained by enzymatic treatment to remove the zona pellucida (Hogan et al., in Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. [1986]). The viral vector system used to introduce the transgene is typically a replication-defective retrovirus carrying the transgene (Jahner et al., Proc. Natl. Acad. Sci. USA 82:6927 [1985]). Transfection is easily and efficiently obtained by culturing the blastomeres on a monolayer of virus-producing cells (Van der Putten, supra; Stewart et al., EMBO J., 6:383 [1987]). Alternatively, infection can be performed at a later stage. Virus or virus-producing cells can be injected into the blastocoele (Jahner et al., Nature 298:623 [1982]). Most of the founders will be mosaic for the transgene since incorporation occurs only in a subset of cells that form the transgenic animal. Further, the founder may contain various retroviral insertions of the transgene at different positions in the genome that generally will segregate in the offspring. In addition, it is also possible to introduce transgenes into the germline, albeit with low efficiency, by intrauterine retroviral infection of the midgestation embryo (Jahner et al., supra [1982]). Additional means of using retroviruses or retroviral vectors to create transgenic animals known to the art involve the micro-injection of retroviral particles or mitomycin C-treated cells producing retrovirus into the perivitelline space of fertilized eggs or early embryos (PCT International Application WO 90/08832 [1990], and Haskell and Bowen, Mol. Reprod. Dev., 40:386 [1995]).

[0265] In other embodiments, the transgene is introduced into embryonic stem cells and the transfected stem cells are utilized to form an embryo. ES cells are obtained by culturing pre-implantation embryos in vitro under appropriate conditions (Evans et al., Nature 292:154 [1981]; Bradley et al., Nature 309:255 [1984]; Gossler et al., Proc. Natl. Acad. Sci. USA 83:9065 [1986]; and Robertson et al., Nature 322:445 [1986]). Transgenes can be efficiently introduced into the ES cells by DNA transfection by a variety of methods known to the art including calcium phosphate co-precipitation, protoplast or spheroplast fusion, lipofection and DEAE-dextran-mediated transfection. Transgenes may also be introduced into ES cells by retrovirus-mediated transduction or by micro-injection. Such transfected ES cells can thereafter colonize an embryo following their introduction into the blastocoel of a blastocyst-stage embryo and contribute to the germ line of the resulting chimeric animal (for review, See, Jaenisch, Science 240:1468 [1988]). Prior to the introduction of transfected ES cells into the blastocoel, the transfected ES cells may be subjected to various selection protocols to enrich for ES cells which have integrated the transgene, assuming that the transgene provides a means for such selection. Alternatively, the polymerase chain reaction may be used to screen for ES cells that have integrated the transgene. This technique obviates the need for growth of the transfected ES cells under appropriate selective conditions prior to transfer into the blastocoel.

[0266] In still other embodiments, homologous recombination is utilized to knock out gene function or create deletion mutants (e.g., mutants in which the LRRs of NPHP4 are deleted). Methods for homologous recombination are described in U.S. Pat. No. 5,614,396, incorporated herein by reference.

VIII. Drug Screening using NPHP4 and Inversin

[0267] As described herein, it is contemplated that nephroretinin, inversin and nephrocystin interact within a novel shared pathogenic pathway (e.g., as shown in Examples 3-5). Accordingly, in some embodiments, the isolated nucleic acid sequences of NPHP4 (e.g., SEQ ID NOS: 1, 5, 7, 9, 11, 13, 15, 17, and 19) and inversin (e.g., SEQ ID NOS: 24, 26, 28, 30, 34, 36, 38 and 40) are used in drug screening applications for compounds that alter (e.g., enhance) signaling within the pathway.

A. Identification of Binding Partners

[0268] In some embodiments, binding partners of NPHP4 amino acids and inversin amino acids are identified. In some embodiments, the NPHP4 nucleic acid sequences (e.g., SEQ ID NOS: 1, 5, 7, 9, 11, 13, 15, 17, and 19) and inversin nucleic acid sequences (e.g., SEQ ID NOS: 21, 23, 25, 27, 29, 33, 35, 37 and 39) or fragments thereof are used in yeast two-hybrid screening assays. For example, in some embodiments, the nucleic acid sequences are subcloned into pGPT9 (Clontech, La Jolla, Calif.) to be used as a bait in a yeast two-hybrid screen for protein-protein interaction of a human fetal kidney cDNA library (Fields and Song, Nature 340:245-246 [1989]; herein incorporated by reference). In other embodiments, phage display is used to identify binding partners (Parmley and Smith, Gene 73:305-318 [1988]; herein incorporated by reference).

B. Drug Screening

[0269] The present invention provides methods and compositions for using NPHP4 and inversin as a target for screening drugs that can alter, for example, interaction between NPHP4 and inversin and their binding partners (e.g., those identified using the above methods).

[0270] In one screening method, the two-hybrid system is used to screen for compounds (e.g., drugs) capable of altering (e.g., inhibiting) NPHP4 function(s) or inversin function(s) (e.g., interaction with a binding partner) in vitro or in vivo. In one embodiment, a GAL4 binding site, linked to a reporter gene such as lacZ, is contacted in the presence and absence of a candidate compound with a GAL4 binding domain linked to a NPHP4 fragment or an inversin fragment and a GAL4 transactivation domain II linked to a binding partner fragment. Expression of the reporter gene is monitored and a decrease in the expression is an indication that the candidate compound inhibits the interaction of NPHP4 or inversin with the binding partner. Alternately, the effect of candidate compounds on the interaction of NPHP4 with other proteins (e.g., proteins known to interact directly or indirectly with the binding partner) can be tested in a similar manner.

[0271] In another screening method, candidate compounds are evaluated for their ability to alter NPHP4 signaling or inversin signaling by contacting NPHP4 or inversin, binding partners, binding partner-associated proteins, or fragments thereof, with the candidate compound and determining binding of the candidate compound to the peptide. The protein or protein fragments are immobilized using methods known in the art such as binding a GST-NPHP4 or a GST-inversin fusion protein to a polymeric bead containing glutathione. A chimeric gene encoding a GST fusion protein is constructed by fusing DNA encoding the polypeptide or polypeptide fragment of interest to the DNA encoding the carboxyl terminus of GST (See e.g., Smith et al., Gene 67:31 [1988]). The fusion construct is then transformed into a suitable expression system (e.g., E. coli XA90) in which the expression of the GST fusion protein can be induced with isopropyl-β-D-thiogalactopyranoside (IPTG). Induction with IPTG should yield the fusion protein as a major constituent of soluble, cellular proteins. The fusion proteins can be purified by methods known to those skilled in the art, including purification by glutathione affinity chromatography. Binding of the candidate compound to the proteins or protein fragments is correlated with the ability of the compound to disrupt the signal transduction pathway and thus regulate NPHP4 or inversin physiological effects (e.g., kidney disease).

[0272] In another screening method, one of the components of the NPHP4 or inversin/binding partner signaling system is immobilized. Polypeptides can be immobilized using methods known in the art, such as adsorption onto a plastic microtiter plate or specific binding of a GST-fusion protein to a polymeric bead containing glutathione. For example, GST-NPHP4 or GST-inversin is bound to glutathione-Sepharose beads. The immobilized peptide is then contacted, in the presence and absence of a candidate compound, with another labeled peptide to which it is capable of binding. Unbound peptide is then removed and the complex solubilized and analyzed to determine the amount of bound labeled peptide. A decrease in binding is an indication that the candidate compound inhibits the interaction of NPHP4 or inversin with the other peptide. A variation of this method allows for the screening of compounds that are capable of disrupting a previously-formed protein/protein complex. For example, in some embodiments a complex comprising NPHP4 or inversin or fragments thereof bound to another peptide is immobilized as described above and contacted with a candidate compound. The dissolution of the complex by the candidate compound correlates with the ability of the compound to disrupt or inhibit the interaction between NPHP4 or inversin and the other peptide.
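
By way of illustration only, the comparison of bound labeled peptide in the presence and absence of a candidate compound can be expressed as a percent decrease in binding. The following Python sketch is not part of the disclosed methods; the function names, the example signal values, and the 50% cutoff are hypothetical assumptions chosen for the example.

```python
def percent_inhibition(bound_without_compound, bound_with_compound):
    """Percent decrease in bound labeled peptide caused by a candidate compound.

    Both arguments are signal readings (arbitrary units) for the labeled
    peptide retained on the immobilized partner after unbound peptide is removed.
    """
    if bound_without_compound <= 0:
        raise ValueError("the no-compound control signal must be positive")
    decrease = bound_without_compound - bound_with_compound
    return 100.0 * decrease / bound_without_compound


def flag_inhibitors(control_signal, compound_signals, cutoff_percent=50.0):
    """Return names of compounds whose percent inhibition meets the cutoff.

    The cutoff is an illustrative assumption, not a value taken from the text.
    """
    return [
        name
        for name, signal in compound_signals.items()
        if percent_inhibition(control_signal, signal) >= cutoff_percent
    ]


if __name__ == "__main__":
    # Hypothetical readings for three candidate compounds.
    readings = {"compound_A": 250.0, "compound_B": 900.0, "compound_C": 480.0}
    print(flag_inhibitors(1000.0, readings))
```

A negative result from percent_inhibition would indicate that binding increased in the presence of the compound rather than decreased.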

[0273] Another technique for drug screening provides high throughput screening for compounds having suitable binding affinity to NPHP4 peptides or inversin peptides and is described in detail in WO 84/03564, incorporated herein by reference. Briefly, large numbers of different small peptide test compounds are synthesized on a solid substrate, such as plastic pins or some other surface. The peptide test compounds are then reacted with NPHP4 peptides or inversin peptides and washed. Bound NPHP4 peptides or inversin peptides are then detected by methods well known in the art.

[0274] Another technique uses NPHP4 antibodies or inversin antibodies, generated as discussed above. Such antibodies capable of specifically binding to NPHP4 peptides or inversin peptides compete with a test compound for binding to NPHP4 or inversin. In this manner, the antibodies can be used to detect the presence of any peptide that shares one or more antigenic determinants of the NPHP4 peptide or inversin peptide.

[0275] The present invention contemplates many other means of screening compounds. The examples provided above are presented merely to illustrate a range of techniques available. One of ordinary skill in the art will appreciate that many other screening methods can be used.

[0276] In particular, the present invention contemplates the use of cell lines transfected with NPHP4 and inversin and variants thereof for screening compounds for activity, and in particular for high throughput screening of compounds from combinatorial libraries (e.g., libraries containing greater than 10⁴ compounds). The cell lines of the present invention can be used in a variety of screening methods. In some embodiments, the cells can be used in second messenger assays that monitor signal transduction following activation of cell-surface receptors. In other embodiments, the cells can be used in reporter gene assays that monitor cellular responses at the transcription/translation level. In still further embodiments, the cells can be used in cell proliferation assays to monitor the overall growth/no growth response of cells to external stimuli.

[0277] In second messenger assays, the host cells are preferably transfected as described above with vectors encoding NPHP4 or inversin or variants or mutants thereof. The host cells are then treated with a compound or plurality of compounds (e.g., from a combinatorial library) and assayed for the presence or absence of a response. It is contemplated that at least some of the compounds in the combinatorial library can serve as agonists, antagonists, activators, or inhibitors of the protein or proteins encoded by the vectors. It is also contemplated that at least some of the compounds in the combinatorial library can serve as agonists, antagonists, activators, or inhibitors of proteins acting upstream or downstream of the protein encoded by the vector in a signal transduction pathway.

[0278] In some embodiments, the second messenger assays measure fluorescent signals from reporter molecules that respond to intracellular changes (e.g., Ca²⁺ concentration, membrane potential, pH, IP₃, cAMP, arachidonic acid release) due to stimulation of membrane receptors and ion channels (e.g., ligand gated ion channels; see Denyer et al., Drug Discov. Today 3:323 [1998]; and Gonzales et al., Drug Discov. Today 4:431-39 [1999]). Examples of reporter molecules include, but are not limited to, FRET (fluorescence resonance energy transfer) systems (e.g., Cuo-lipids and oxonols, EDAN/DABCYL), calcium-sensitive indicators (e.g., Fluo-3, FURA 2, INDO 1, and FLUO3/AM, BAPTA AM), chloride-sensitive indicators (e.g., SPQ, SPA), potassium-sensitive indicators (e.g., PBFI), sodium-sensitive indicators (e.g., SBFI), and pH-sensitive indicators (e.g., BCECF).

[0279] In general, the host cells are loaded with the indicator prior to exposure to the compound. Responses of the host cells to treatment with the compounds can be detected by methods known in the art, including, but not limited to, fluorescence microscopy, confocal microscopy (e.g., FCS systems), flow cytometry, microfluidic devices, FLIPR systems (See, e.g., Schroeder and Neagle, J. Biomol. Screening 1:75 [1996]), and plate-reading systems. In some preferred embodiments, the response (e.g., increase in fluorescent intensity) caused by a compound of unknown activity is compared to the response generated by a known agonist and expressed as a percentage of the maximal response of the known agonist. The maximum response caused by a known agonist is defined as a 100% response. Likewise, the maximal response recorded after addition of an agonist to a sample containing a known or test antagonist is detectably lower than the 100% response.
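
By way of illustration only, expressing a response as a percentage of the maximal response of a known agonist can be sketched in Python as follows. This is a minimal sketch, not part of the disclosed methods: the function name and the example intensities are hypothetical, and subtracting an untreated baseline reading before normalizing is an assumption that the text above does not specify.

```python
def percent_of_max_response(test_signal, baseline, agonist_max_signal):
    """Express a test response as a percentage of the known agonist's maximum.

    test_signal        -- fluorescence reading after adding the test compound
    baseline           -- reading from untreated, indicator-loaded cells
                          (baseline subtraction is an assumption of this sketch)
    agonist_max_signal -- reading at the maximal response of the known agonist,
                          which is defined as the 100% response
    """
    span = agonist_max_signal - baseline
    if span <= 0:
        raise ValueError("agonist maximum must exceed the baseline reading")
    return 100.0 * (test_signal - baseline) / span


if __name__ == "__main__":
    # Hypothetical fluorescence intensities in arbitrary units.
    baseline = 200.0
    agonist_max = 1000.0
    for label, reading in [("known agonist", 1000.0), ("unknown compound", 600.0)]:
        pct = percent_of_max_response(reading, baseline, agonist_max)
        print(f"{label}: {pct:.1f}% of maximal response")
```

Under the same normalization, an agonist reading taken in the presence of an antagonist would be expected to fall below 100%.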

[0280] The cells are also useful in reporter gene assays. Reporter gene assays involve the use of host cells transfected with vectors encoding a nucleic acid comprising transcriptional control elements of a target gene (i.e., a gene that controls the biological expression and function of a disease target) spliced to a coding sequence for a reporter gene. Therefore, activation of the target gene results in activation of the reporter gene product. In some embodiments, the reporter gene construct comprises the 5′ regulatory region (e.g., promoters and/or enhancers) of a protein whose expression is controlled by NPHP4 or inversin in operable association with a reporter gene (See Example 4 and Inohara et al., J. Biol. Chem. 275:27823 [2000] for a description of the luciferase reporter construct pBVIx-Luc). Examples of reporter genes finding use in the present invention include, but are not limited to, chloramphenicol transferase, alkaline phosphatase, firefly and bacterial luciferases, β-galactosidase, β-lactamase, and green fluorescent protein. The production of these proteins, with the exception of green fluorescent protein, is detected through the use of chemiluminescent, colorimetric, or bioluminescent products of specific substrates (e.g., X-gal and luciferin). Comparisons between compounds of known and unknown activities may be conducted as described above.

[0281] Specifically, the present invention provides screening methods for identifying modulators, i.e., candidate or test compounds or agents (e.g., proteins, peptides, peptidomimetics, peptoids, small molecules or other drugs) which bind to NPHP4 or inversin of the present invention, have an inhibitory (or stimulatory) effect on, for example, NPHP4 or inversin expression or NPHP4 or inversin activity, or have a stimulatory or inhibitory effect on, for example, the expression or activity of a NPHP4 or inversin substrate. Compounds thus identified can be used to modulate the activity of target gene products (e.g., NPHP4 or inversin genes) either directly or indirectly in a therapeutic protocol, to elaborate the biological function of the target gene product, or to identify compounds that disrupt normal target gene interactions. Compounds which stimulate the activity of a variant NPHP4 or variant inversin or mimic the activity of a non-functional variant are particularly useful in the treatment of cystic kidney diseases (e.g., NPHP).

[0282] In one embodiment, the invention provides assays for screening candidate or test compounds that are substrates of a NPHP4 protein or inversin protein or polypeptide or a biologically active portion thereof. In another embodiment, the invention provides assays for screening candidate or test compounds that bind to or modulate the activity of a NPHP4 protein or inversin protein or polypeptide or a biologically active portion thereof.

[0283] The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including biological libraries; peptoid libraries (libraries of molecules having the functionalities of peptides, but with a novel, non-peptide backbone, which are resistant to enzymatic degradation but which nevertheless remain bioactive; see, e.g., Zuckermann et al., J. Med. Chem. 37:2678 [1994]); spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library and peptoid library approaches are preferred for use with peptide libraries, while the other four approaches are applicable to peptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

[0284] Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al., Proc. Natl. Acad. Sci. U.S.A. 90:6909 [1993]; Erb et al., Proc. Natl. Acad. Sci. USA 91:11422 [1994]; Zuckermann et al., J. Med. Chem. 37:2678 [1994]; Cho et al., Science 261:1303 [1993]; Carell et al., Angew. Chem. Int. Ed. Engl. 33:2059 [1994]; Carell et al., Angew. Chem. Int. Ed. Engl. 33:2061 [1994]; and Gallop et al., J. Med. Chem. 37:1233 [1994].

[0285] Libraries of compounds may be presented in solution (e.g., Houghten, Biotechniques 13:412-421 [1992]), or on beads (Lam, Nature 354:82-84 [1991]), chips (Fodor, Nature 364:555-556 [1993]), bacteria or spores (U.S. Pat. No. 5,223,409; herein incorporated by reference), plasmids (Cull et al., Proc. Natl. Acad. Sci. USA 89:1865-1869 [1992]) or on phage (Scott and Smith, Science 249:386-390 [1990]; Devlin, Science 249:404-406 [1990]; Cwirla et al., Proc. Natl. Acad. Sci. 87:6378-6382 [1990]; Felici, J. Mol. Biol. 222:301 [1991]).

[0286] In one embodiment, an assay is a cell-based assay in which a cell that expresses a NPHP4 or inversin protein or biologically active portion thereof is contacted with a test compound, and the ability of the test compound to modulate NPHP4 activity or inversin activity is determined. Determining the ability of the test compound to modulate NPHP4 activity or inversin activity can be accomplished by monitoring, for example, changes in enzymatic activity. The cell, for example, can be of mammalian origin.

[0287] The ability of the test compound to modulate NPHP4 binding or inversin binding to a compound, e.g., a NPHP4 substrate or inversin substrate, can also be evaluated. This can be accomplished, for example, by coupling the compound, e.g., the substrate, with a radioisotope or enzymatic label such that binding of the compound, e.g., the substrate, to NPHP4 or inversin can be determined by detecting the labeled compound, e.g., substrate, in a complex.

[0288] Alternatively, the NPHP4 or inversin is coupled with a radioisotope or enzymatic label to monitor the ability of a test compound to modulate NPHP4 binding or inversin binding to a NPHP4 substrate or inversin substrate in a complex. For example, compounds (e.g., substrates) can be labeled with ¹²⁵I, ³⁵S, ¹⁴C or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

[0289] The ability of a compound (e.g., a NPHP4 substrate or inversin substrate) to interact with NPHP4 or inversin with or without the labeling of any of the interactants can be evaluated. For example, a microphysiometer can be used to detect the interaction of a compound with a NPHP4 or inversin without the labeling of either the compound or the NPHP4 or inversin (McConnell et al., Science 257:1906-1912 [1992]). As used herein, a “microphysiometer” (e.g., Cytosensor) is an analytical instrument that measures the rate at which a cell acidifies its environment using a light-addressable potentiometric sensor (LAPS). Changes in this acidification rate can be used as an indicator of the interaction between a compound and NPHP4 or inversin.

[0290] In yet another embodiment, a cell-free assay is provided in which a NPHP4 protein or inversin protein or biologically active portion thereof is contacted with a test compound and the ability of the test compound to bind to the NPHP4 protein or inversin protein or a biologically active portion thereof is evaluated. Preferred biologically active portions of the NPHP4 proteins or inversin proteins to be used in assays of the present invention include fragments that participate in interactions with substrates or other proteins, e.g., fragments with high surface probability scores.

[0291] Cell-free assays involve preparing a reaction mixture of the target gene protein and the test compound under conditions and for a time sufficient to allow the two components to interact and bind, thus forming a complex that can be removed and/or detected.

[0292] The interaction between two molecules can also be detected, e.g., using fluorescence energy transfer (FRET) (see, for example, Lakowicz et al., U.S. Pat. No. 5,631,169; Stavrianopoulos et al., U.S. Pat. No. 4,968,103; each of which is herein incorporated by reference). A fluorophore label is selected such that a first donor molecule's emitted fluorescent energy will be absorbed by a fluorescent label on a second, ‘acceptor’ molecule, which in turn is able to fluoresce due to the absorbed energy.

[0293] Alternately, the ‘donor’ protein molecule may simply utilize the natural fluorescent energy of tryptophan residues. Labels are chosen that emit different wavelengths of light, such that the ‘acceptor’ molecule label may be differentiated from that of the ‘donor’. Since the efficiency of energy transfer between the labels is related to the distance separating the molecules, the spatial relationship between the molecules can be assessed. In a situation in which binding occurs between the molecules, the fluorescent emission of the ‘acceptor’ molecule label in the assay should be maximal. A FRET binding event can be conveniently measured through standard fluorometric detection means well known in the art (e.g., using a fluorimeter).
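The distance dependence mentioned above is commonly modeled with the Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor separation and R0 is the Förster radius at which transfer is 50% efficient. The disclosure itself does not state this formula; the sketch below is offered only as an illustration, and the function name and the example radius of 5 nm are assumptions.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Energy-transfer efficiency under the Foerster relation.

    distance_nm: donor-acceptor separation r, in nanometers.
    forster_radius_nm: Foerster radius R0 of the dye pair, in nanometers
    (an illustrative input; real values depend on the chosen labels).
    """
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Foerster radius > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Bound partners (short separation) transfer energy far more efficiently
# than unbound partners (long separation), so acceptor emission rises.
bound = fret_efficiency(2.5, 5.0)
unbound = fret_efficiency(10.0, 5.0)
```

Because efficiency falls off with the sixth power of distance, the acceptor signal changes sharply over a narrow range of separations, which is what makes the readout useful as a binding indicator.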

[0294] In another embodiment, determining the ability of the NPHP4 protein or inversin protein to bind to a target molecule can be accomplished using real-time Biomolecular Interaction Analysis (BIA) (see, e.g., Sjolander and Urbaniczky, Anal. Chem. 63:2338-2345 [1991] and Szabo et al., Curr. Opin. Struct. Biol. 5:699-705 [1995]). “Surface plasmon resonance” or “BIA” detects biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore). Changes in the mass at the binding surface (indicative of a binding event) result in alterations of the refractive index of light near the surface (the optical phenomenon of surface plasmon resonance (SPR)), resulting in a detectable signal that can be used as an indication of real-time reactions between biological molecules.

[0295] In one embodiment, the target gene product or the test substance is anchored onto a solid phase. The target gene product/test compound complexes anchored on the solid phase can be detected at the end of the reaction. Preferably, the target gene product can be anchored onto a solid surface, and the test compound (which is not anchored) can be labeled, either directly or indirectly, with detectable labels discussed herein.

[0296] It may be desirable to immobilize NPHP4 or inversin, an anti-NPHP4 or anti-inversin antibody or their target molecules to facilitate separation of complexed from non-complexed forms of one or both of the proteins, as well as to accommodate automation of the assay. Binding of a test compound to a NPHP4 protein or inversin protein, or interaction of a NPHP4 protein or inversin protein with a target molecule in the presence and absence of a candidate compound, can be accomplished in any vessel suitable for containing the reactants. Examples of such vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion protein can be provided that adds a domain that allows one or both of the proteins to be bound to a matrix. For example, glutathione-S-transferase-NPHP4 or glutathione-S-transferase-inversin fusion proteins or glutathione-S-transferase/target fusion proteins can be adsorbed onto glutathione Sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione-derivatized microtiter plates, which are then combined with the test compound or the test compound and either the non-adsorbed target protein or NPHP4 protein or inversin protein, and the mixture incubated under conditions conducive for complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads or microtiter plate wells are washed to remove any unbound components, the matrix is immobilized in the case of beads, and complex formation is determined either directly or indirectly, for example, as described above.

[0297] Alternatively, the complexes can be dissociated from the matrix, and the level of NPHP4 or inversin binding or activity determined using standard techniques. Other techniques for immobilizing either NPHP4 protein or inversin protein or a target molecule on matrices include using conjugation of biotin and streptavidin. Biotinylated NPHP4 or inversin protein or target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques known in the art (e.g., biotinylation kit, Pierce Chemicals, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical).

[0298] In order to conduct the assay, the non-immobilized component is added to the coated surface containing the anchored component. After the reaction is complete, unreacted components are removed (e.g., by washing) under conditions such that any complexes formed will remain immobilized on the solid surface. The detection of complexes anchored on the solid surface can be accomplished in a number of ways. Where the previously non-immobilized component is pre-labeled, the detection of label immobilized on the surface indicates that complexes were formed. Where the previously non-immobilized component is not pre-labeled, an indirect label can be used to detect complexes anchored on the surface; e.g., using a labeled antibody specific for the immobilized component (the antibody, in turn, can be directly labeled or indirectly labeled with, e.g., a labeled anti-IgG antibody).

[0299] This assay is performed utilizing antibodies reactive with NPHP4 protein or inversin protein or target molecules but which do not interfere with binding of the NPHP4 protein or inversin protein to its target molecule. Such antibodies can be derivatized to the wells of the plate, and unbound target or NPHP4 protein or inversin protein trapped in the wells by antibody conjugation. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the NPHP4 protein or inversin protein or target molecule, as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the NPHP4 protein or inversin protein or target molecule.

[0300] Alternatively, cell free assays can be conducted in a liquid phase. In such an assay, the reaction products are separated from unreacted components by any of a number of standard techniques, including, but not limited to: differential centrifugation (see, for example, Rivas and Minton, Trends Biochem Sci 18:284-7 [1993]); chromatography (gel filtration chromatography, ion-exchange chromatography); electrophoresis (see, e.g., Ausubel et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: N.Y.); and immunoprecipitation (see, for example, Ausubel et al., eds. Current Protocols in Molecular Biology 1999, J. Wiley: N.Y.). Such resins and chromatographic techniques are known to one skilled in the art (See e.g., Heegaard, J. Mol. Recognit 11:141-8 [1998]; Hage and Tweed, J. Chromatogr. Biomed. Sci. Appl 699:499-525 [1997]). Further, fluorescence energy transfer may also be conveniently utilized, as described herein, to detect binding without further purification of the complex from solution.

[0301] The assay can include contacting the NPHP4 protein or inversin protein or biologically active portion thereof with a known compound that binds the NPHP4 or inversin to form an assay mixture, contacting the assay mixture with a test compound, and determining the ability of the test compound to interact with a NPHP4 protein or inversin protein, wherein determining the ability of the test compound to interact with a NPHP4 protein or inversin protein includes determining the ability of the test compound to preferentially bind to NPHP4 or inversin or biologically active portion thereof, or to modulate the activity of a target molecule, as compared to the known compound.

[0302] To the extent that NPHP4 or inversin can, in vivo, interact with one or more cellular or extracellular macromolecules, such as proteins, inhibitors of such an interaction are useful. A homogeneous assay can be used to identify inhibitors.

[0303] For example, a preformed complex of the target gene product and the interactive cellular or extracellular binding partner product is prepared such that either the target gene products or their binding partners are labeled, but the signal generated by the label is quenched due to complex formation (see, e.g., U.S. Pat. No. 4,109,496, herein incorporated by reference, that utilizes this approach for immunoassays). The addition of a test substance that competes with and displaces one of the species from the preformed complex will result in the generation of a signal above background. In this way, test substances that disrupt target gene product-binding partner interaction can be identified. Alternatively, NPHP4 protein or inversin protein can be used as a “bait protein” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al., Cell 72:223-232 [1993]; Madura et al., J. Biol. Chem. 268:12046-12054 [1993]; Bartel et al., Biotechniques 14:920-924 [1993]; Iwabuchi et al., Oncogene 8:1693-1696 [1993]; and Brent WO 94/10300; each of which is herein incorporated by reference), to identify other proteins that bind to or interact with NPHP4 or inversin (“NPHP4-binding proteins” or “NPHP4-bp” or “inversin-binding proteins” or “inversin-bp”) and are involved in NPHP4 activity or inversin activity. Such NPHP4-bps or inversin-bps can be activators or inhibitors of signals by the NPHP4 proteins or inversin proteins or targets as, for example, downstream elements of a NPHP4-mediated or inversin-mediated signaling pathway.

[0304] Modulators of NPHP4 expression or inversin expression can also be identified. For example, a cell or cell free mixture is contacted with a candidate compound and the expression of NPHP4 mRNA or protein or inversin mRNA or protein evaluated relative to the level of expression of NPHP4 mRNA or protein or inversin mRNA or protein in the absence of the candidate compound. When expression of NPHP4 mRNA or protein or inversin mRNA or protein is greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of NPHP4 mRNA or protein or inversin mRNA or protein expression. Alternatively, when expression of NPHP4 mRNA or protein or inversin mRNA or protein is less (i.e., statistically significantly less) in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of NPHP4 mRNA or protein or inversin mRNA or protein expression. The level of NPHP4 mRNA or protein or inversin mRNA or protein expression can be determined by methods described herein for detecting NPHP4 mRNA or protein or inversin mRNA or protein.
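The comparison described in this paragraph can be sketched as a small classification routine. The disclosure does not name a statistical test; the example below assumes an exact two-sided permutation test on replicate expression measurements and applies the significance requirement to both stimulators and inhibitors. The function name, the default alpha of 0.05, and the sample values are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def classify_expression_modulator(treated, untreated, alpha=0.05):
    """Label a candidate as stimulator, inhibitor, or without effect.

    treated / untreated: replicate expression levels (same units) measured
    with and without the candidate compound. Uses an exact two-sided
    permutation test on the difference of means (an assumed choice).
    Returns (label, p_value).
    """
    if not treated or not untreated:
        raise ValueError("both groups need at least one measurement")
    observed = mean(treated) - mean(untreated)
    pooled = list(treated) + list(untreated)
    indices = range(len(pooled))
    total = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled values as "treated".
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mean(group_a) - mean(group_b)) >= abs(observed) - 1e-12:
            extreme += 1
    p_value = extreme / total
    if p_value >= alpha:
        return "no significant effect", p_value
    return ("stimulator" if observed > 0 else "inhibitor"), p_value


label, p = classify_expression_modulator(
    [10.1, 9.8, 10.4, 10.0], [5.0, 5.2, 4.9, 5.1]
)
```

With an exact permutation test, at least four replicates per group are needed before a two-sided p-value can fall below 0.05, since three per group yields a minimum of 2/20. Full enumeration is practical only for small replicate counts.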

[0305] A modulating agent can be identified using a cell-based or a cell free assay, and the ability of the agent to modulate the activity of a NPHP4 protein or inversin protein can be confirmed in vivo, e.g., in an animal such as an animal model for a disease (e.g., an animal with kidney disease; See e.g., Hildebrandt and Otto, J. Am. Soc. Nephrol. 11:1753 [2000]).

C. Therapeutic Agents

[0306] This invention further pertains to novel agents identified by the above-described screening assays. Accordingly, it is within the scope of this invention to further use an agent identified as described herein (e.g., a NPHP4 or inversin modulating agent or mimetic, a NPHP4 or inversin specific antibody, or a NPHP4 or inversin binding partner) in an appropriate animal model (such as those described herein) to determine the efficacy, toxicity, side effects, or mechanism of action, of treatment with such an agent. Furthermore, novel agents identified by the above-described screening assays can be, e.g., used for treatments of cystic kidney disease (e.g., including, but not limited to, NPHP kidney disease).

IX. Pharmaceutical Compositions Containing NPHP4 Nucleic Acid, Peptides, and Analogs

[0307] The present invention further provides pharmaceutical compositions which may comprise all or portions of NPHP4 polynucleotide sequences, NPHP4 polypeptides, inhibitors or antagonists of NPHP4 bioactivity, including antibodies, alone or in combination with at least one other agent, such as a stabilizing compound, and may be administered in any sterile, biocompatible pharmaceutical carrier, including, but not limited to, saline, buffered saline, dextrose, and water.

[0308] The methods of the present invention find use in treating diseases or altering physiological states characterized by mutant NPHP4 alleles (e.g., NPHP type 4 kidney disease or RP). Peptides can be administered to the patient intravenously in a pharmaceutically acceptable carrier such as physiological saline. Standard methods for intracellular delivery of peptides can be used (e.g., delivery via liposome). Such methods are well known to those of ordinary skill in the art. The formulations of this invention are useful for parenteral administration, such as intravenous, subcutaneous, intramuscular, and intraperitoneal. Therapeutic administration of a polypeptide intracellularly can also be accomplished using gene therapy as described above.

[0309] As is well known in the medical arts, dosages for any one patient depend upon many factors, including the patient's size, body surface area, age, the particular compound to be administered, sex, time and route of administration, general health, and interaction with other drugs being concurrently administered.

[0310] Accordingly, in some embodiments of the present invention, NPHP4 nucleotide and NPHP4 amino acid sequences can be administered to a patient alone, or in combination with other nucleotide sequences, drugs or hormones or in pharmaceutical compositions where it is mixed with excipient(s) or other pharmaceutically acceptable carriers. In one embodiment of the present invention, the pharmaceutically acceptable carrier is pharmaceutically inert. In another embodiment of the present invention, NPHP4 polynucleotide sequences or NPHP4 amino acid sequences may be administered alone to individuals subject to or suffering from a disease.

[0311] Depending on the condition being treated, these pharmaceutical compositions may be formulated and administered systemically or locally. Techniques for formulation and administration may be found in the latest edition of “Remington's Pharmaceutical Sciences” (Mack Publishing Co, Easton Pa.). Suitable routes may, for example, include oral or transmucosal administration; as well as parenteral delivery, including intramuscular, subcutaneous, intramedullary, intrathecal, intraventricular, intravenous, intraperitoneal, or intranasal administration.

[0312] For injection, the pharmaceutical compositions of the invention may be formulated in aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution, Ringer's solution, or physiologically buffered saline. For tissue or cellular administration, penetrants appropriate to the particular barrier to be permeated are used in the formulation. Such penetrants are generally known in the art.

[0313] In other embodiments, the pharmaceutical compositions of the present invention can be formulated using pharmaceutically acceptable carriers well known in the art in dosages suitable for oral administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets, pills, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral or nasal ingestion by a patient to be treated.

[0314] Pharmaceutical compositions suitable for use in the present invention include compositions wherein the active ingredients are contained in an effective amount to achieve the intended purpose. For example, an effective amount of NPHP4 may be that amount that suppresses apoptosis. Determination of effective amounts is well within the capability of those skilled in the art, especially in light of the disclosure provided herein.

[0315] In addition to the active ingredients, these pharmaceutical compositions may contain suitable pharmaceutically acceptable carriers comprising excipients and auxiliaries that facilitate processing of the active compounds into preparations that can be used pharmaceutically. The preparations formulated for oral administration may be in the form of tablets, dragees, capsules, or solutions.

[0316] The pharmaceutical compositions of the present invention may be manufactured in a manner that is itself known (e.g., by means of conventional mixing, dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or lyophilizing processes).

[0317] Pharmaceutical formulations for parenteral administration include aqueous solutions of the active compounds in water-soluble form. Additionally, suspensions of the active compounds may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection suspensions may contain substances that increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may also contain suitable stabilizers or agents that increase the solubility of the compounds to allow for the preparation of highly concentrated solutions.

[0318] Pharmaceutical preparations for oral use can be obtained by combining the active compounds with solid excipient, optionally grinding a resulting mixture, and processing the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are carbohydrate or protein fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; starch from corn, wheat, rice, potato, etc.; cellulose such as methyl cellulose, hydroxypropylmethyl-cellulose, or sodium carboxymethylcellulose; gums including arabic and tragacanth; and proteins such as gelatin and collagen. If desired, disintegrating or solubilizing agents may be added, such as the cross-linked polyvinyl pyrrolidone, agar, alginic acid or a salt thereof such as sodium alginate.

[0319] Dragee cores are provided with suitable coatings such as concentrated sugar solutions, which may also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be added to the tablets or dragee coatings for product identification or to characterize the quantity of active compound (i.e., dosage).

[0320] Pharmaceutical preparations that can be used orally include push-fit capsules made of gelatin, as well as soft, sealed capsules made of gelatin and a coating such as glycerol or sorbitol. The push-fit capsules can contain the active ingredients mixed with a filler or binders such as lactose or starches, lubricants such as talc or magnesium stearate, and, optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycol with or without stabilizers.

[0321] Compositions comprising a compound of the invention formulated in a pharmaceutically acceptable carrier may be prepared, placed in an appropriate container, and labeled for treatment of an indicated condition. For polynucleotide or amino acid sequences of NPHP4, conditions indicated on the label may include treatment of a condition related to apoptosis.

[0322] The pharmaceutical composition may be provided as a salt and can be formed with many acids, including but not limited to hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic, etc. Salts tend to be more soluble in aqueous or other protonic solvents than are the corresponding free base forms. In other cases, the preferred preparation may be a lyophilized powder in 1 mM-50 mM histidine, 0.1%-2% sucrose, 2%-7% mannitol at a pH range of 4.5 to 5.5 that is combined with buffer prior to use.

[0323] For any compound used in the method of the invention, the therapeutically effective dose can be estimated initially from cell culture assays. Then, preferably, dosage can be formulated in animal models (particularly murine models) to achieve a desirable circulating concentration range that adjusts NPHP4 levels.

[0324] A therapeutically effective dose refers to that amount of NPHP4 that ameliorates symptoms of the disease state. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD₅₀ (the dose lethal to 50% of the population) and the ED₅₀ (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index, and it can be expressed as the ratio LD₅₀/ED₅₀. Compounds that exhibit large therapeutic indices are preferred. The data obtained from these cell culture assays and additional animal studies can be used in formulating a range of dosage for human use. The dosage of such compounds lies preferably within a range of circulating concentrations that include the ED₅₀ with little or no toxicity. The dosage varies within this range depending upon the dosage form employed, sensitivity of the patient, and the route of administration.
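The therapeutic index defined above is a simple ratio, and a brief sketch shows how candidate compounds might be ranked by it. The function names, compound names, and dose values below are hypothetical and serve only to illustrate the arithmetic; both doses are assumed to be given in the same units.

```python
def therapeutic_index(ld50, ed50):
    """Return the ratio LD50/ED50 for doses given in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


def rank_by_therapeutic_index(compounds):
    """Sort (name, ld50, ed50) tuples from largest to smallest index."""
    return sorted(
        compounds,
        key=lambda item: therapeutic_index(item[1], item[2]),
        reverse=True,
    )


# Hypothetical compounds: name, LD50 and ED50 in mg/kg.
candidates = [
    ("compound_a", 500.0, 5.0),   # index 100
    ("compound_b", 200.0, 20.0),  # index 10
]
preferred = rank_by_therapeutic_index(candidates)[0][0]
```

Here compound_a would be preferred, consistent with the statement that compounds exhibiting large therapeutic indices are preferred.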

[0325] The exact dosage is chosen by the individual physician in view of the patient to be treated. Dosage and administration are adjusted to provide sufficient levels of the active moiety or to maintain the desired effect. Additional factors which may be taken into account include the severity of the disease state; age, weight, and gender of the patient; diet, time and frequency of administration, drug combination(s), reaction sensitivities, and tolerance/response to therapy. Long acting pharmaceutical compositions might be administered every 3 to 4 days, every week, or once every two weeks depending on half-life and clearance rate of the particular formulation.

[0326] Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of about 1 g, depending upon the route of administration. Guidance as to particular dosages and methods of delivery is provided in the literature (See, U.S. Pat. Nos. 4,657,760; 5,206,344; or 5,225,212, all of which are herein incorporated by reference). Those skilled in the art will employ different formulations for NPHP4 than for the inhibitors of NPHP4. Administration to the bone marrow may necessitate delivery in a manner different from intravenous injections.

Experimental

[0327] The following examples are provided in order to demonstrate and further illustrate certain preferred embodiments and aspects of the present invention and are not to be construed as limiting the scope thereof.

[0328] In the experimental disclosure which follows, the following abbreviations apply: eq (equivalents); M (Molar); μM (micromolar); N (Normal); mol (moles); mmol (millimoles); μmol (micromoles); nmol (nanomoles); g (grams); mg (milligrams); μg (micrograms); ng (nanograms); l or L (liters); ml (milliliters); μl (microliters); cm (centimeters); mm (millimeters); μm (micrometers); nm (nanometers); ° C. (degrees Centigrade); U (units); mU (milliunits); min. (minutes); sec. (seconds); % (percent); kb (kilobase); bp (base pair); PCR (polymerase chain reaction); BSA (bovine serum albumin); Fisher (Fisher Scientific, Pittsburgh, Pa.); Sigma (Sigma Chemical Co., St. Louis, Mo.); Promega (Promega Corp., Madison, Wis.); Perkin-Elmer (Perkin-Elmer/Applied Biosystems, Foster City, Calif.); Boehringer Mannheim (Boehringer Mannheim, Corp., Indianapolis, Ind.); Clonetech (Clonetech, Palo Alto, Calif.); Qiagen (Qiagen, Santa Clarita, Calif.); Stratagene (Stratagene Inc., La Jolla, Calif.); National Biosciences (National Biosciences Inc, Plymouth Minn.); NEB (New England Biolabs, Beverly, Mass.); wt (wild-type); Ab (antibody); NPHP (nephronophthisis); SLS (Senior-Løken syndrome); RP (retinitis pigmentosa); and ESRD (end stage renal disease).

EXAMPLE 1

A. Methods

Pedigree and Diagnosis

[0329] Blood samples and pedigrees were obtained following informed consent from patients with NPHP and their parents. Diagnostic criteria were (i) development of ESRD following a history of polyuria, polydipsia, and anemia; (ii) renal ultrasound compatible with NPHP. In all families with the exception of F461 the diagnosis of NPHP was confirmed by renal biopsy. ESRD developed within a range of 6-35 years with a median age of 22 years (Table 1). In SLS, the renal symptoms are associated with RP. Clinical data for SLS family F3 have been published previously (Polak et al., Am J Ophthalmol 95:487-94 [1983]; Schuermann et al., Am J Hum Genet 70:1240-1246 [2002]; herein incorporated by reference). All three affected siblings had RP suggestive of Leber congenital amaurosis. Ophthalmologic data for family F60 have been published (Fillastre et al., Clin Nephrol 5:14-19 [1976]; herein incorporated by reference) and comprise: In J. C. (Fillastre et al. 1976, supra) amblyopia and rotary nystagmus with grossly impaired vision starting at age 8 months, and on fundoscopy retino-choroidal atrophy surrounded by pigment. In individuals M.C.B. and M.M.B. there were abnormal ERG findings with diminished amplitude (Fillastre et al. 1976, supra).

Haplotype and Mutational Analysis

[0330] The “screening markers” used for haplotype analysis consisted of microsatellite markers D1S2845, D1S2660, D1S2795, D1S2870, D1S2642, D1S214, D1S2663, D1S1612 (in pter to cen orientation) (Dib et al., Nature 380:152 [1996]). Novel microsatellite markers were generated by searching for di-, tri-, and tetra-nucleotide repeats using the BLAST program on human genomic sequence in the interval between flanking markers D1S2660 and D1S2642. Preparation of genomic DNA and haplotype analysis were performed as described previously (Schuermann et al. 2002, supra). Mutational analysis was performed using exon-flanking primers as described previously (Schuermann et al. 1996). Primers are shown in Table 2.

TABLE 2 Primer sequences (from 5′ to 3′) used in exon amplification for mutational analysis of NPHP4.

Exon  Forward Primer            Reverse Primer            Size (bp)  Fwd SEQ ID     Rev SEQ ID
1     gtcggacatgcaaatcaggaggctctggccaacactg               439        SEQ ID NO:21   SEQ ID NO:51
2     aagccttcaggattgctgtg      catccatctgttaactggaagc    319        SEQ ID NO:22   SEQ ID NO:52
3     acatggcctgccagtgac        cctggacccacaagtctgag      346        SEQ ID NO:23   SEQ ID NO:53
4     acgtgtaggaaggcggtctc      gacgagcagttaaaccaccatag   649        SEQ ID NO:24   SEQ ID NO:54
5     gaggcctccatgtgctttc       gctaaaggtggggaacactc      209        SEQ ID NO:25   SEQ ID NO:55
6     tgaccctcattgagaactgcgtgccttcaaggtttcactg            217        SEQ ID NO:26   SEQ ID NO:56
7     ttgtgctctgtctgggagtc      catcagatgcggggtctc        439        SEQ ID NO:27   SEQ ID NO:57
8     ctcccccagggacttctg        cctgacatgcacaaatgacc      335        SEQ ID NO:28   SEQ ID NO:58
9     ttctgacagtggtcgacgtg      tgcccactacatttatcctcac    279        SEQ ID NO:29   SEQ ID NO:59
10    cactgttgatttcccctctc      gcaaacatatttgtgaacttttgc  343        SEQ ID NO:30   SEQ ID NO:60
11    ttcctggttggatcgttctgcgacgattatcttacaaatgtgg         329        SEQ ID NO:31   SEQ ID NO:61
12    aggcctgtggagacctgac       ggggacagagggttttcttg      232        SEQ ID NO:32   SEQ ID NO:62
13    catgttgggagctttgtgg       gacaggcacagtgcaaaaac      262        SEQ ID NO:33   SEQ ID NO:63
14    atctgagcaccgttggttg       gggttcacaaggtccaacag      295        SEQ ID NO:34   SEQ ID NO:64
15    ggtttccacagggaggtg        aggtcagaacctcagcgaag      345        SEQ ID NO:35   SEQ ID NO:65
16    accatcccctatgcaaacacgcactggtcaccgtatgattc           409        SEQ ID NO:36   SEQ ID NO:66
17    gaccagagctgaaatctctt      acgctggaagcgtgactc        315        SEQ ID NO:37   SEQ ID NO:67
18    cacagtggctttcctgctg       cgagggagcccacactctac      358        SEQ ID NO:38   SEQ ID NO:68
19    tgtggtgggttgatctgttt      cactgacagcaccacgaatg      332        SEQ ID NO:39   SEQ ID NO:69
20    ccctggtgtctgctcctg        gaggcagggaaaggatgtg       351        SEQ ID NO:40   SEQ ID NO:70
21    agcaatagccccttgtggag      tctcgggcagaattcgag        386        SEQ ID NO:41   SEQ ID NO:71
22    tctctcccactcctctgagcagggacactggtggagactg            377        SEQ ID NO:42   SEQ ID NO:72
23    tggcagtggtgtctctaagc      aggaggggagagaaggacac      251        SEQ ID NO:43   SEQ ID NO:73
24    ttggcaacagtggagatacg      catgaggccatctgtcacc       342        SEQ ID NO:44   SEQ ID NO:74
25    tcttgctgagcacctgtgac      aggatacccgtggggaag        282        SEQ ID NO:45   SEQ ID NO:75
26    cactcgctgcgtgtattagt      caagcccactttcaatccac      268        SEQ ID NO:46   SEQ ID NO:76
27    ccttgttggcctctcgtg        ccagctgaatgcccactg        318        SEQ ID NO:47   SEQ ID NO:77
28    ggaaccacccatgaccttgcagtggtccgagtcacagg              388        SEQ ID NO:48   SEQ ID NO:78
29    cagggaatacttggaggaag      gaggaactcgctcctaaatgc     310        SEQ ID NO:49   SEQ ID NO:79
30    gcagagaggttgctggtgag      accgggcttgtgctgtag        738        SEQ ID NO:50   SEQ ID NO:80
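
The repeat search described in paragraph [0330] was performed with BLAST on genomic sequence. As a rough illustration only, the following Python sketch finds tandem di-, tri-, and tetra-nucleotide repeats with a regular expression instead. The function name, the minimum repeat count of 8, and the toy sequence are assumptions made for this example and are not taken from the study.

```python
import re


def find_microsatellites(seq, min_repeats=8):
    """Return (start, motif, repeat_count) for tandem 2-4 nt repeats.

    min_repeats is an illustrative threshold, not a value from the text.
    """
    seq = seq.lower()
    # Lazy {2,4}? tries the shortest motif first, so "caca..." is reported
    # as a "ca" repeat rather than a "caca" repeat.
    pattern = re.compile(r"([acgt]{2,4}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as "aaaa..."
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Toy sequence (invented): a (ca)10 repeat and a (gata)8 repeat.
toy = "ttgg" + "ca" * 10 + "gtcc" + "gata" * 8 + "cc"
for start, motif, count in find_microsatellites(toy):
    print(start, motif, count)
```

Candidate repeats found this way would still need flanking primers and testing for polymorphism before they could serve as markers.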

Northern Blot Analysis

[0331] A multiple tissue Northern blot with human adult poly(A)+ RNA (Clontech MTN7760-1) was hybridized with an NPHP4 DNA probe of 584 bp, derived from exon 30 (nt 4141-4724; see FIG. 4), generated by PCR amplification of human genomic DNA. The probe was labeled with [³²P]dCTP using the Random Primers DNA Labeling System (Invitrogen). Hybridization was carried out at 68° C. using EXPRESSHYB solution (Clontech, Palo Alto, Calif.). The final washing condition was 0.1×SSC, 0.1% SDS at 50° C. for 40 min.

Results

[0332] A gene locus (NPHP4) for NPHP type 4 was mapped by total genome search for linkage within a 2.1 Mb interval delimited by flanking markers D1S2660 and D1S2642 (Schuermann et al. 2002, supra). To establish compatibility with linkage to NPHP4 in further kindred, 20 NPHP families with multiple affected children or parental consanguinity, in whom no mutation was present in the NPHP1 gene, were selected. In 8 families there was an association of NPHP with retinitis pigmentosa (RP). Haplotype analysis using 8 microsatellite markers covering the critical NPHP4 region (Schuermann et al. 2002, supra; herein incorporated by reference) was compatible with linkage to NPHP4 in 9 families, including 2 families with RP. To further refine the critical genetic interval of 2.1 Mb, high-resolution haplotype analysis was performed in these 9 families and the 7 families with linkage to NPHP4 published previously (Schuermann et al., 2002, supra). In 2 families (F3, F60) NPHP was associated with RP. Eight published (Dib et al. 1996, supra) and 38 newly generated microsatellite markers were used at an average marker density of 1 marker per 45 kb within the interval of flanking markers D1S2660 and D1S2642 (FIG. 1). Haplotype analysis, by the criterion of minimization of recombinants, clearly revealed erroneous inversion of sequence between markers D1S2795 and D1S244 in human genomic sequence databases (www.ensembl.org).
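
The stated marker density can be rechecked with simple arithmetic. The Python sketch below divides the 2.1 Mb interval by the 46 markers used (8 published plus 38 newly generated); the helper name is hypothetical, and evenly spaced markers are assumed purely for the calculation.

```python
def mean_marker_spacing_kb(interval_mb, n_markers):
    """Average distance between markers in kb, assuming even spacing."""
    return interval_mb * 1000.0 / n_markers


spacing = mean_marker_spacing_kb(2.1, 8 + 38)
print(round(spacing, 1))  # about 45.7 kb per marker
```

The result is consistent with the figure of roughly 1 marker per 45 kb given in the text.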

[0333] Using high resolution haplotype data, the correct marker order at the NPHP4 locus was established as pter-D1S2660-D1S2795-D1S2633-D1S2870-D1S253-D1S2642-D1S214-D1S1612-D1S2663-D1S244-cen (flanking markers to NPHP4: D1S2660 and D1S2642). A 22 kb sequence gap remaining in the interval D1S2660-D1S2795 was filled by use of CELERA human genomic sequence. In haplotype analysis, 3 consanguineous kindred yielded new key recombinants by the criterion of homozygosity by descent (Lander and Botstein, Science 236:1567 [1987]) (FIG. 1). The NPHP4 critical genetic interval was thus refined to <1.2 Mb within secure borders based on a large kindred, and in addition, to <700 kb within suggestive borders based on 2 small families (FIG. 1, FIGS. 2A, B). Within the 700 kb critical interval for NPHP4 there mapped 3 known genes (KCNAB2, RPL22, and ICMT), and 3 unknown genes (Q9UFQ2, Q9UFR9, and Q96MP2) (FIG. 2B). In addition, in the interval between Q9UFQ2 and flanking marker D1E19 (FIG. 2B) the program GENESCAN predicted approximately 40 non-annotated exons (www.ensembl.org). Mutational analysis was performed in affected individuals of the 16 families compatible with linkage to NPHP4, examining all 79 exons of the 3 known and 3 unknown genes by direct sequencing of the forward strands of exon-PCR products. While no mutations were detected in 5 of these genes, in Q9UFQ2 11 distinct mutations were detected in 8 of the 16 families with NPHP (Table 1). In families F3 and F60 NPHP is associated with RP. In the affected individuals from all 8 families, mutations were shown to segregate from both parents (Table 1). All of these mutations were absent from 92-96 healthy control individuals. Nine of the 11 mutations detected represent very likely loss-of-function mutations: 5 were STOP codon mutations, 1 was a frame shift mutation, and 3 were obligatory splice consensus mutations (Table 1 and FIGS. 2D and 6-16). Q9UFQ2 was thus identified as the gene causing NPHP type 4. The gene was termed NPHP4 and the respective gene product was called “nephroretinin” for its role in nephronophthisis and retinitis pigmentosa. In the 5 consanguineous families F3, F30, F32, F60, and F622, all mutations occurred in the homozygous state and represented STOP codon mutations and one frame shift mutation, truncating the protein in exons 18, 23, 11, 16, and 18, respectively (Table 1; FIGS. 2D, E). In the 3 non-consanguineous families, 6 distinct compound heterozygous mutations were found. Four represented STOP codon or obligatory splice consensus mutations, truncating the gene product in exons 15, 16, 17, and 24. The missense mutations R848W and G754R affect amino acid residues conserved in mouse and cow. No mutations were detected in 8 families.
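
The criterion of homozygosity by descent used above can be pictured as a search for the longest stretch of consecutive homozygous markers in an affected child of consanguineous parents; the first heterozygous marker on either side then delimits the candidate interval. The Python sketch below illustrates that idea only. The marker names follow the order given in paragraph [0333], but the allele numbers and the function name are invented for the example and do not reproduce any family's data.

```python
def homozygous_run(markers, genotypes):
    """Longest run of consecutive homozygous markers plus its flanking markers.

    markers   : marker names in pter-to-cen order
    genotypes : one (allele_1, allele_2) tuple per marker
    """
    best_start, best_len = None, 0
    run_start = None
    for i, (a, b) in enumerate(genotypes):
        if a == b:
            if run_start is None:
                run_start = i
            run_len = i - run_start + 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_start = None  # a heterozygous marker ends the current run
    if best_start is None:
        return None
    end = best_start + best_len - 1
    left = markers[best_start - 1] if best_start > 0 else None
    right = markers[end + 1] if end + 1 < len(markers) else None
    return {"homozygous": markers[best_start:end + 1], "flanking": (left, right)}


markers = ["D1S2660", "D1S2795", "D1S2633", "D1S2870", "D1S253", "D1S2642"]
# Hypothetical allele calls for one affected individual.
genotypes = [(3, 5), (2, 2), (4, 4), (1, 1), (6, 6), (2, 7)]
result = homozygous_run(markers, genotypes)
print(result["homozygous"], result["flanking"])
```

In practice the interval is refined across several kindred, and uninformative markers (homozygous by state rather than by descent) have to be judged by hand.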

[0334] NPHP4 expression studies by Northern blot analysis revealed a 5.9 kb transcript strongly expressed in human skeletal muscle, weakly in kidney, and in 6 additional tissues studied (FIG. 3). Northern dot blot analysis confirmed a widespread expression pattern in human adult and fetal tissues including testis. This broad expression pattern, with strong expression in skeletal muscle and testis, corresponds well with the expression pattern described for the NPHP1 gene (Otto et al., J. Am. Soc. Nephrol. 11:270 [2000]).

[0335] Human genomic sequence of NPHP4 (KIAA0673) was assembled using the Homo sapiens chromosome 1 working draft sequence segment NT_028054, which predicted 25 exons. Five additional 5′ exons were identified using additional working draft sequence, the mRNA KIAA0673 and 57 human ESTs from the UniGene cluster Hs.106487. The genomic structure shown in FIGS. 2C, D and FIG. 4 was confirmed by human/mouse total genomic sequence comparison. The NPHP4 gene contains 30 exons encoding 1426 amino acids and extends over 130 kb, with splice sites that conform to the canonical consensus gt-ag. An exception was found in intron 24, with gc-ag splicing, which occurs in 0.5% of mammalian splice sites (Burset et al., Nuc. Acid. Res. 29:255 [2001]). A polymorphism is known to be present at the intron 20 splice acceptor (tg for ag). Presence of exon 20 is supported by 3 human EST clones. Ten different splice variants have been suggested for KIAA0673 (See e.g., the Internet web site of NCBI).
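
The splice-site check described above amounts to reading the first and last two nucleotides of each intron. The following Python sketch shows one way to classify introns as canonical gt-ag, the rarer gc-ag, or other; the function name is hypothetical and the intron strings are invented placeholders, not NPHP4 sequence.

```python
def splice_class(intron):
    """Classify an intron by its terminal dinucleotides (donor-acceptor)."""
    intron = intron.lower()
    donor, acceptor = intron[:2], intron[-2:]
    if (donor, acceptor) == ("gt", "ag"):
        return "canonical gt-ag"
    if (donor, acceptor) == ("gc", "ag"):
        return "non-canonical gc-ag"
    return "other (%s-%s)" % (donor, acceptor)


# Invented toy introns for illustration only.
toy_introns = {
    "toy_intron_1": "gtaagt" + "c" * 20 + "ttttcag",
    "toy_intron_2": "gcaagt" + "c" * 20 + "ttttcag",
    "toy_intron_3": "gtaagt" + "c" * 20 + "tttttg",
}
for name, seq in toy_introns.items():
    print(name, splice_class(seq))
```

The third toy intron mimics an acceptor reading tg instead of ag, the kind of change reported for the intron 20 polymorphism.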

[0336] The NPHP4 cDNA (FIG. 4) and deduced nephroretinin protein sequences were found to be novel, without any sequence similarity to known human cDNA or protein sequences. Therefore, NPHP4 encodes a hitherto unknown protein. As shown for the NPHP1 gene product nephrocystin (Hildebrandt et al., Nature Genet. 17:149 [1997]; Otto et al., J. Am. Soc. Nephrol. 11:270 [2000]), there was, however, strong sequence conservation for nephroretinin in evolution, with 23% amino acid identity in a protein of C. elegans (FIG. 5). Translated EST sequences also demonstrated evolutionary conservation in mouse, cow, pig, zebrafish, Xenopus laevis, Ascaris suum, and Halocynthia roretzi. Sequence identity of the murine homologue was 78% (FIG. 5). Analysis of nephroretinin amino acid sequence revealed no signal sequence, conserved domains, or predicted transmembrane regions. In the N-terminal half there was a putative nuclear localization signal (NLS), a glutamate-rich (E-rich) and a proline-rich (P-rich) domain. The latter two have also been found in nephrocystin (Otto et al., [2000], supra). No sequence similarity to nephrocystin was present. In addition, 2 serine-rich (S-rich) sequences and a C-terminal endoplasmic reticulum membrane domain were found in human and murine nephroretinin sequences. Encoded by exons 15 and 16, there was in nephroretinin a domain of unknown function (DUF339) with evolutionary conservation including prokaryotes and a 63 amino acid stretch with 30% sequence identity to a gas vesicle protein of Halobacterium salinarium (FIG. 5).
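
Identity figures such as the 23% and 78% quoted above depend on how the percentage is defined. The text does not say which definition was used, so the Python sketch below adopts one common convention as an assumption: identical residues divided by the number of aligned (non-gap) columns of a pairwise alignment. The first toy string is the start of the nephroretinin sequence (SEQ ID NO:2); the second is invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


human_start = "MNDWHRIF-TQ"  # first residues of SEQ ID NO:2, with one alignment gap
toy_other = "MNDWQRIFATQ"    # invented comparison sequence
print(percent_identity(human_start, toy_other))  # 9 of 10 aligned columns match
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which gives lower values when gaps are present.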
TABLE 1 Clinical details and mutations detected in families with NPHP4

          Number of                           Retinitis                 Parental                     Nucleotide        Effect on coding
Family    affecteds  ESRD at age              pigmentosa  Origin        consanguinity  Exon   change^(b)        sequence             Segregation^(c)
F3^(a)    3          28 y, 30 y, 35 y         yes         Turkey        yes            18     C2335T            Q779X                hom
F24       2          ND                       no          Germany       no             17     G2260A            G754R                P
                                                                                       17     IVS16−1 G>C       Splice site          M
F30^(a)   3          18 y, 22 y, 22 y         no          Germany       yes            23     3272delT          STOP at codon L1121  hom
F32       2          19 y, 20 y               no          India         yes            11     TC1334-1335AA     F445X                hom
F60       4          6 y, 10 y, 17 y, 22 y    yes         France        yes            16     C1972T            R658X                hom
F444^(a)  2          23 y, 33 y               no          Finland       no             15     IVS15+1 G>A       Splice site          M
                                                                                       24     IVS24+1 G>A       Splice site          P
F461^(a)  3          ND                       no          France        no             16     C2044T            R682X                P
                                                                                       19     C2542T            R848W                M
F622      2          8 y, 9 y                 no          Afghanistan   yes            18     G2368T            E790X                hom
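
The nucleotide and amino acid positions in Table 1 can be cross-checked against each other. Assuming that nucleotide positions are counted from the A of the ATG start codon as +1 (an assumption here, though it is the convention stated for INVS in FIG. 15b), the codon containing a given cDNA position is (position − 1) // 3 + 1. The Python sketch below applies this to the single-nucleotide substitutions of Table 1; the function name is hypothetical.

```python
import re


def codon_of(nt_pos):
    """Codon number and offset within the codon (1-3) for a cDNA position.

    Assumes position 1 is the A of the ATG start codon.
    """
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


# Substitutions copied from Table 1: nucleotide change -> reported effect.
TABLE1_SUBSTITUTIONS = {
    "C2335T": "Q779X",
    "G2260A": "G754R",
    "C1972T": "R658X",
    "C2044T": "R682X",
    "C2542T": "R848W",
    "G2368T": "E790X",
}

for nt_change, aa_change in TABLE1_SUBSTITUTIONS.items():
    nt_pos = int(re.search(r"\d+", nt_change).group())
    aa_pos = int(re.search(r"\d+", aa_change).group())
    codon, offset = codon_of(nt_pos)
    print(nt_change, aa_change, codon, offset, codon == aa_pos)
```

Under this numbering every listed substitution falls on the first base of the reported codon, which fits changes such as CGA to TGA (Arg to stop).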

EXAMPLE 2 Mutations in INVS Cause NPHP2

[0337] Mutational analysis was performed on 16 exons of INVS in genomic DNA from nine affected individuals from seven different families with early onset of NPHP. One individual (from family A7) was included from the initial description (Gagnadoux et al., Pediatr. Nephrol. 3, 50 [1989]) of infantile NPHP (individual 5) and two affected siblings (VII-1 and VII-3 in family A12) from the Bedouin kindred (Haider et al., Am. J. Hum. Genet. 63, 1404 [1998]) in which the NPHP2 locus was first mapped (Table 3). Nine distinct recessive mutations were detected in INVS (Table 3 and FIG. 15). In six individuals, both mutated alleles were detected. In individual A10, only one heterozygous mutation was found.

[0338] Mutations in INVS (nucleotide exchange and amino acid exchange) are shown (FIG. 15a) together with sequence traces for mutated sequence (top) and sequence from healthy controls (bottom). Family numbers are given above boxes. If only one mutation is shown, it occurred in the homozygous state, except in individual A10, in whom only one mutation in the heterozygous state was detected. In individual 868, the 2747insA mutation is shown in the flipped version of the reverse strand. The exon structure of INVS is shown in FIG. 15b. Lines indicate relative positions and connect to mutations detected in INVS. Open and filled boxes represent INVS exons drawn relative to scale bar. Positions of start codon (ATG) at nucleotide +1 and of stop codon (TGA) are indicated. A representation of protein motifs drawn to scale parallel to exon structure is shown (FIG. 15c). Lines connect to point mutations detected (as shown in FIGS. 15a and 15d).

EXAMPLE 3 Inversin Associates with Nephrocystin in HEK293T Cells and Mouse Tissue

[0339] Myc-tagged nephrocystin (Myc-NPHP1) was coexpressed with N-terminally FLAG-tagged full-length inversin (FLAG-INV) or FLAG-tagged TRAF2 (FLAG-TRAF2) protein as a negative control. After immunoprecipitation with anti-FLAG antibody, coprecipitating nephrocystin was detected with nephrocystin-specific antiserum (FIG. 26a, left panel). Protein expression levels in cellular lysates were controlled by immunoblotting using a nephrocystin antibody (FIG. 26a, middle panel) or FLAG-specific and nephrocystin-specific antibodies (FIG. 26a, right panel). Molecular weight markers are shown in kDa. Full-length nephrocystin was fused to the CH2 and CH3 domains of human IgG1 and precipitated with protein G sepharose beads. FLAG-tagged inversin specifically coprecipitated with nephrocystin but not with control protein (CH2 and CH3 domains of human IgG1 without nephrocystin fusion) as shown with FLAG-specific antibody (FIG. 26b). FLAG-tagged nephrocystin or FLAG-tagged TRAF2 protein as a negative control was coexpressed with N-terminally Myc-tagged full-length inversin (Myc-INV). After immunoprecipitation with anti-FLAG antibody, coprecipitating inversin was detected with inversin-specific antiserum (FIG. 26c, left and middle panels). Appropriate controls were also run (FIG. 26c, right panel). A rabbit antiserum to an MBP-inversin fusion protein (amino acids 561-716 of mouse inversin) specifically recognized inversin (amino acids 1-716) expressed in HEK293T cells (FIG. 26d, left panel) but not the FLAG-tagged control proteins podocin (FLAG-podocin), nephrocystin (FLAG-NPHP1) or PACS-1 (FLAG-PACS-1, amino acids 85-280) (FIG. 26d, left panel). It also specifically recognized recombinant GST-inversin (amino acids 561-716) but not two other control GST fusion proteins (FIG. 26d, lower panel). To show endogenous nephrocystin-inversin interaction in vivo in mouse kidney, half of the mouse kidney tissue lysate was immunoprecipitated with a control antibody to hemagglutinin (anti-HA), and the other half was precipitated with anti-nephrocystin antisera. Immobilized inversin was detected with the inversin-specific antisera (FIG. 26e, right upper panel). Precipitation of endogenous nephrocystin was confirmed by reprobing the blot for nephrocystin (FIG. 26e, right lower panel). Appropriate controls are also shown (FIG. 26e, left panels).

EXAMPLE 4 β-tubulin is a Nephrocystin Interaction Partner

[0340] In order to identify nephrocystin-interacting proteins, HEK 293T cells were transfected with the FLAG-tagged control protein GFP or FLAG-tagged nephrocystin. Specific association of β-tubulin with nephrocystin was confirmed by immunoblotting of 2D gels using anti-β-tubulin antibody (FIG. 27a). Several FLAG-tagged nephrocystin truncations were generated to analyze the interaction of nephrocystin with β-tubulin. Endogenous β-tubulin precipitated with transfected full-length nephrocystin but not with the control proteins GFP or TRAF2 (FIG. 27b, upper panel). Expression of native β-tubulin in lysates is also shown (FIG. 27b, middle panel). The membrane depicted in FIG. 27b, middle panel, was reprobed with anti-FLAG antibody and shows that β-tubulin is still detected below the 62 kDa marker, confirming comparable expression levels of the FLAG-tagged proteins (FIG. 27b, lower panel). The interaction was mapped to a region of nephrocystin involving amino acids 237-670 (FIG. 27c, upper panel) with the expression levels of β-tubulin shown as a control (FIG. 27c, bottom panel). The membrane was reprobed with anti-FLAG antibody to confirm expression of the FLAG-tagged proteins in the lysates (FIG. 27c, lower panel). Endogenous β-tubulin coprecipitates with native nephrocystin in ciliated mCcd-K1 cells (FIG. 27d).

EXAMPLE 5 Inversin and Nephrocystin Colocalize with β-tubulin to Cilia

[0341] Nephrocystin and β-tubulin-4 colocalize in primary cilia of MDCK cells (FIG. 28a, upper and lower panels). Wild-type MDCK cells (clone II) were grown on coverslips at 100% confluence and cultivated for 7 d before the experiment to allow full polarization and cilia formation. Localization of nephrocystin was determined by immunofluorescence using nephrocystin-specific antibody with confocal images captured at the level of the apical membrane. Cells were costained with rabbit antibody to nephrocystin (FIG. 28a, left panels) and mouse antibody to β-tubulin-4 (FIG. 28a, middle panels) followed by the respective secondary antibodies. Specific localization of nephrocystin in primary cilia was confirmed by the use of blocking recombinant nephrocystin protein (FIG. 28b). Inversin localizes to primary cilia in MDCK cells (FIG. 28c). Localization of endogenous inversin was determined by immunofluorescence using inversin-specific antibody with confocal images captured at the level of the apical membrane. Cells were costained with mouse antibody to β-tubulin-4 and rabbit antibody to inversin followed by the respective secondary antibodies (FIG. 28c, lower panel). In additional stainings, the antibody to β-tubulin-4 was omitted to reduce potential spectral overlap between the inversin and β-tubulin-4 signals (FIG. 28c, upper panel). Partial colocalization of nephrocystin and inversin in primary cilia is observed (FIG. 28d). Localization of nephrocystin was determined by immunofluorescence using nephrocystin-specific antibody with confocal images captured at the level of the apical membrane. Cells were costained with goat antibody to inversin (FIG. 28d, left panel) and rabbit antibody to nephrocystin (FIG. 28d, middle panel) followed by the respective secondary antibodies. Partial colocalization is shown (FIG. 28d, right panel).

EXAMPLE 6 Disruption of Zebrafish invs Function Results in Renal Cyst Formation

[0342] It was determined that embryos injected with a control, non-specific oligonucleotide have normal morphology (FIG. 29a), whereas embryos injected with atgMO and spMO have a pronounced ventral axis curvature at 3 d.p.f. (combined totals for atgMO and spMO: 432 of 479 injected embryos; 90%) (FIG. 29b). Coinjection of 100 pg mouse Invs mRNA with spMO completely rescued axis curvature defects (combined totals for atgMO and spMO: 363 of 381 mRNA+MO injected embryos were rescued; 95%) (FIG. 29c). FIG. 29d shows a histological section of a 2.5-d.p.f. control embryo pronephros showing the midline glomerulus (Gl), pronephric tubule (Pt) and pronephric duct (Pd). FIG. 29e shows an atgMO-injected 3-d.p.f. embryo showing cystic dilatation of pronephric tubules and glomerulus (indicated with an asterisk) lined with squamous epithelium. FIG. 29f shows that spMO similarly causes cystic maldevelopment of the pronephric tubules (marked with an asterisk). Molecular analysis of morpholino targeted invs splicing defects was performed. RT-PCR analysis of invs expression in 24-h.p.f. control injected embryos generates a 746-bp invs fragment encoding the C-terminal domain (FIG. 29g, lane C, nucleotides 2,233-2,979 of GenBank AF465261; lane M, φX174 markers). spMO-injected embryos analyzed with the same RT-PCR primers generate a 189-bp RT-PCR product representing a C-terminal invs deletion allele (FIG. 29g, lanes spMO; 24, 48 and 72 h.p.f.). Some recovery of wild-type (WT) mRNA is observed at 72 h.p.f. RT-PCR of ACTB mRNA on the same RNA samples as in FIG. 29g shows no effect of morpholino injection at any time point (FIG. 29h). FIG. 29i diagrams the effect of spMO on invs mRNA processing. Preventing normal splicing in the IQ2 domain recruits a cryptic splice donor in upstream invs coding sequence; the resulting out-of-frame fusion generates a C-terminally truncated invs mRNA at amino acid 696 with an altered 21 amino acid C terminus (FIG. 29i). Rescue of normal morphology by coinjected spMO and mouse Invs mRNA shows a normal pronephric duct structure (Pt) (FIG. 29j) as compared to the absence of any effect when the Invs mRNA was injected alone.

TABLE 3

Family              Ethnic    Nucleotide        Alteration(s) in  Exon,             Parental       Renal   Renal   Age at          Situs inversus
(individual)        origin    alteration(s)^(a) coding sequence   segregation^(b)   consanguinity  cysts   biopsy  ESRD^(c)        (other symptoms)^(d)
A6                  France    C2695T            R899X             13, het^(e)       −              −       +       <2 y            −
                              1453delC          Q485fsX509        9, het^(e)
A8                  Turkey    C1807T            R603X             12, hom^(e)       +              −       +       14 mo           + (VSD^(f))
A9                  France    C1186T            R396X             8, het^(e)        −              +       +       <2 y            −
                              C1445G            P482R             9, het P
A10^(g)             France    2908delG          E970fsX971        14, het M         −              +       +       12 mo           −
A12 (VII-1, VII-3)  Israel    C2719T            R907X             13, hom M, P      +              +       (+, +)  (30 mo, 30 mo)  −, − (HT, HT)
868 (II-1, II-2)    USA       C2719T            R907X             13, het M         −              −^(h)   (+, +)  (5 y, 4 y)      −, − (HT, HT)
                              2747insA          K916fsX1002       13, het P
A7                  Portugal  T1478C            L493S             10, hom^(e)       +              ND      +       5 y             − (HT)
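
The protein-level notation in Table 3 is regular enough to sort the INVS mutations by type with a few patterns. The Python sketch below is illustrative only: the function name and the three category labels are assumptions for the example, and the list holds the distinct coding-sequence alterations copied from Table 3.

```python
import re
from collections import Counter


def classify(aa_change):
    """Sort a protein-level change into frameshift, nonsense, or missense."""
    if re.fullmatch(r"[A-Z]\d+fsX\d+", aa_change):
        return "frameshift"
    # Test nonsense before missense, because "X" is also a capital letter.
    if re.fullmatch(r"[A-Z]\d+X", aa_change):
        return "nonsense"
    if re.fullmatch(r"[A-Z]\d+[A-Z]", aa_change):
        return "missense"
    return "unclassified"


# Distinct alterations in coding sequence listed in Table 3.
TABLE3_CHANGES = [
    "R899X", "Q485fsX509", "R603X", "R396X", "P482R",
    "E970fsX971", "R907X", "K916fsX1002", "L493S",
]

counts = Counter(classify(change) for change in TABLE3_CHANGES)
print(len(TABLE3_CHANGES), dict(counts))
```

The tally of nine distinct entries agrees with the nine distinct recessive mutations reported in Example 2.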

[0343] All publications and patents mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in molecular biology, genetics, or related fields are intended to be within the scope of the following claims.

1 102 1 4994 DNA Homo sapiens 1 gacgcgaggc gggttcttgg actgagtgtgcggcgcggtg cgccgccttc cgaggctcct 60 cccgcgggtg gcagcggacg gggcgcgcccctcggccagt cctcggtcct caggcttgtg 120 gctccgttga gcaccggccg ccgggcctctgggtccgtcg agtggagact ctctgaaaag 180 cgtgggctcc gtggcctccg gcgcggccgcggcgggtcgg tctcctagat catccgggaa 240 gcccacggga ccctcaggcg ggcaggatgaacgactggca caggatcttc acccaaaacg 300 tgcttgtccc tccccaccca cagagagcgcgccagccttg gaaggaatcc acggcattcc 360 agtgtgtcct caagtggctg gacggaccggtaattaggca gggcgtgctg gaggtactgt 420 cagaggttga atgccatctg cgagtgtctttctttgatgt cacctaccgg cacttctttg 480 ggaggacgtg gaaaaccaca gtgaagccgacgaagagacc gccgtccagg atcgtcttta 540 atgagccctt gtattttcac acatccctaaaccaccctca tatcgtggct gtggtggaag 600 tggtcgctga gggcaagaaa cgggatgggagcctccagac attgtcctgt gggtttggaa 660 ttcttcggat cttcagcaac cagccggactctcctatctc tgcttcccag gacaaaaggt 720 tgcggctgta ccatggcacc cccagagccctcctgcaccc gcttctccag gaccccgcag 780 agcaaaacag acacatgacc ctcattgagaactgcagcct gcagtacacg ctgaagccac 840 acccggccct ggagcctgcg ttccaccttcttcctgagaa ccttctggtg tctggtctgc 900 agcagatacc tggcctgctt ccagctcatggagaatccgg cgacgctctc cgaaagcctc 960 gcctccagaa gcccatcacg gggcacttggatgacttatt cttcaccctg tacccctccc 1020 tggagaagtt tgaggaagag ctgctggagctccacgtcca ggaccacttc caggagggat 1080 gtggcccact ggacggtggt gccctggagatcctggagcg gcgcctgcgt gtgggcgtgc 1140 acaatggtct gggcttcgtg cagaggccgcaggtcgttgt actggtgcct gagatggatg 1200 tggccttgac gcgctcagct agcttcagcaggaaagtggt ctcctcttcc aagaccagct 1260 ccgggagcca agctctggtt ttgagaagccgcctccgcct cccagagatg gtcggccacc 1320 ctgcatttgc ggtcatcttc cagctggagtacgtgttcag cagccctgca ggagtggacg 1380 gcaatgcagc ttcggtcacc tctctgtccaacctggcatg catgcacatg gtccgctggg 1440 ctgtttggaa ccccttgctg gaagctgattctggaagggt gaccctgcct ctgcagggtg 1500 ggatccagcc caacccctcg cactgtctggtctacaaggt accctcagcc agcatgagct 1560 ctgaagaggt gaagcaggtg gagtcgggtacactccggtt ccagttctcg ctgggctcag 1620 aagaacacct ggatgcaccc acggagcctgtcagtggccc caaagtggag cggcggcctt 1680 ccaggaaacc acccacgtcc 
ccttcgagcccgccagcgcc agtacctcga gttctcgctg 1740 ccccgcagaa ctcacctgtg ggaccagggttgtcaatttc ccagctggcg gcctccccgc 1800 ggtccccgac tcagcactgc ttggccaggcctacttcaca gctaccccat ggctctcagg 1860 cctccccggc ccaggcacag gagttcccgttggaggccgg tatctcccac ctggaagccg 1920 acctgagcca gacctccctg gtcctggaaacatccattgc cgaacagtta caggagctgc 1980 cgttcacgcc tttgcatgcc cctattgttgtgggaaccca gaccaggagc tctgcagggc 2040 agccctcgag agcctccatg gtgctcctgcagtcctccgg ctttcccgag attctggatg 2100 ccaataaaca gccagccgag gctgtcagcgctacagaacc tgtgacgttt aaccctcaga 2160 aggaagaatc agattgtcta caaagcaacgagatggtgct acagtttctt gcctttagca 2220 gagtggccca ggactgccga ggaacatcatggccaaagac tgtgtatttc accttccagt 2280 tctaccgctt cccacccgca acgacgccacgactgcagct ggtccagctg gatgaggccg 2340 gccagcccag ctctggcgcc ctgacccacatcctcgtgcc tgtgagcaga gatggcacct 2400 ttgatgctgg gtctcctggc ttccagctgaggtacatggt gggccctggg ttcctgaagc 2460 caggtgagcg gcgctgcttt gcccgctacctggccgtgca gaccctgcag attgacgtct 2520 gggacggaga ctccctgctg ctcatcggatctgctgccgt ccagatgaag catctcctcc 2580 gccaaggccg gccggctgtg caggcctcccacgagcttga ggtcgtggca actgaatacg 2640 agcaggacaa catggtggtg agtggagacatgctggggtt tggccgcgtc aagcccatcg 2700 gcgtccactc ggtggtgaag ggccggctgcacctgacttt ggccaacgtg ggtcacccgt 2760 gtgaacagaa agtgagaggt tgtagcacattgccaccgtc cagatctcgg gtcatctcaa 2820 acgatggagc cagccgcttc tctggaggcagcctcctcac gactggaagc tcaaggcgaa 2880 aacacgtggt gcaagcacag aagctggcggacgtggacag tgagctggct gccatgctac 2940 tgacccatgc ccggcagggc aaggggccccaggacgtcag ccgcgagtcg gatgccaccc 3000 gcaggcgtaa gctggagcgg atgaggtctgtgcgcctgca ggaggccggg ggagacttgg 3060 gccggcgcgg gacgagcgtg ttggcgcagcagagcgtccg cacacagcac ttgcgggacc 3120 tacaggtcat cgccgcctac cgggaacgcacgaaggccga gagcatcgcc agcctgctga 3180 gcctggccat caccacggag cacacgctccacgccacgct gggggtcgcc gagttctttg 3240 agtttgtgct taagaacccc cacaacacacagcacacggt gactgtggag atcgacaacc 3300 ccgagctcag cgtcatcgtg gacagtcaggagtggaggga cttcaagggt gctgctggcc 3360 tgcacacacc ggtggaggag gacatgttccacctgcgtgg cagcctggcc 
ccccagctct 3420 acctgcgccc ccacgagacc gcccacgtccccttcaagtt ccagagcttc tctgcagggc 3480 agctggccat ggtgcaggcc tctcctgggttgagcaacga gaagggcatg gacgccgtgt 3540 caccttggaa gtccagcgca gtgcccactaaacacgccaa ggtcttgttc cgagcgagtg 3600 gtggcaagcc catcgccgtg ctctgcctgactgtggagct gcagccccac gtggtggacc 3660 aggtcttccg cttctatcac ccggagctctccttcctgaa gaaggccatc cgcctgccgc 3720 cctggcacac atttccaggt gctccggtgggaatgcttgg tgaggacccc ccagtccatg 3780 ttcgctgcag cgacccgaac gtcatctgtgagacccagaa tgtgggcccc ggggaaccac 3840 gggacatatt tctgaaggtg gccagtggtccaagcccgga gatcaaagac ttctttgtca 3900 tcatttactc ggatcgctgg ctggcgacacccacacagac gtggcaggtc tacctccact 3960 ccctgcagcg cgtggatgtc tcctgcgtcgcaggccagct gacccgcctg tcccttgtcc 4020 ttcgggggac acagacagtg aggaaagtgagagctttcac ctctcatccc caggagctga 4080 agacagaccc caaaggtgtc ttcgtgctgccgcctcgtgg ggtgcaggac ctgcatgttg 4140 gcgtgaggcc ccttagggcc ggcagccgctttgtccatct caacctggtg gacgtggatt 4200 gccaccagct ggtggcctcc tggctcgtgtgcctctgctg ccgccagccg ctcatctcca 4260 aggcctttga gatcatgttg gctgcgggcgaagggaaggg tgtcaacaag aggatcacct 4320 acaccaaccc ctacccctcc cggaggacattccacctgca cagcgaccac ccggagctgc 4380 tgcggttcag agaggactcc ttccaggtcgggggtggaga gacctacacc atcggcttgc 4440 agtttgcgcc tagtcagaga gtgggtgaggaggagatcct gatctacatc aatgaccatg 4500 aggacaaaaa cgaagaggca ttttgcgtgaaggtcatcta ccagtgaggg cttgagggtg 4560 acgtccttcc tgcggcaccc agctggggcctgtctgtgcc cctcctgccc tgcaggctgt 4620 cctccccgcc tctctgcagc ctttcacttcagtgcccacc tggctgacct gtgcacttgg 4680 ctgaggaagc agagaccgag cgctggtcattttgtagtac ctgcatccag cttagctgct 4740 gctgacaccc agcaggcctg ggttccgtgagcgcgaactc cgtggtggtg ggtctggctc 4800 tggtgctgcc atctacgcat gtgggaccctcgttatcgct gttgctcaaa atgtatttta 4860 tgaatcatcc taaatgagaa aattatgtttttcttactgg attttgtaca aacataatct 4920 attatttgct atgcaatatt ttatgctggtattatatctg ttttttaaat tgttgaacaa 4980 aatactaaac tttt 4994 2 1426 PRTHomo sapiens 2 Met Asn Asp Trp His Arg Ile Phe Thr Gln Asn Val Leu ValPro Pro 1 5 10 15 His Pro Gln Arg Ala Arg Gln Pro Trp Lys Glu 
Ser ThrAla Phe Gln 20 25 30 Cys Val Leu Lys Trp Leu Asp Gly Pro Val Ile Arg GlnGly Val Leu 35 40 45 Glu Val Leu Ser Glu Val Glu Cys His Leu Arg Val SerPhe Phe Asp 50 55 60 Val Thr Tyr Arg His Phe Phe Gly Arg Thr Trp Lys ThrThr Val Lys 65 70 75 80 Pro Thr Lys Arg Pro Pro Ser Arg Ile Val Phe AsnGlu Pro Leu Tyr 85 90 95 Phe His Thr Ser Leu Asn His Pro His Ile Val AlaVal Val Glu Val 100 105 110 Val Ala Glu Gly Lys Lys Arg Asp Gly Ser LeuGln Thr Leu Ser Cys 115 120 125 Gly Phe Gly Ile Leu Arg Ile Phe Ser AsnGln Pro Asp Ser Pro Ile 130 135 140 Ser Ala Ser Gln Asp Lys Arg Leu ArgLeu Tyr His Gly Thr Pro Arg 145 150 155 160 Ala Leu Leu His Pro Leu LeuGln Asp Pro Ala Glu Gln Asn Arg His 165 170 175 Met Thr Leu Ile Glu AsnCys Ser Leu Gln Tyr Thr Leu Lys Pro His 180 185 190 Pro Ala Leu Glu ProAla Phe His Leu Leu Pro Glu Asn Leu Leu Val 195 200 205 Ser Gly Leu GlnGln Ile Pro Gly Leu Leu Pro Ala His Gly Glu Ser 210 215 220 Gly Asp AlaLeu Arg Lys Pro Arg Leu Gln Lys Pro Ile Thr Gly His 225 230 235 240 LeuAsp Asp Leu Phe Phe Thr Leu Tyr Pro Ser Leu Glu Lys Phe Glu 245 250 255Glu Glu Leu Leu Glu Leu His Val Gln Asp His Phe Gln Glu Gly Cys 260 265270 Gly Pro Leu Asp Gly Gly Ala Leu Glu Ile Leu Glu Arg Arg Leu Arg 275280 285 Val Gly Val His Asn Gly Leu Gly Phe Val Gln Arg Pro Gln Val Val290 295 300 Val Leu Val Pro Glu Met Asp Val Ala Leu Thr Arg Ser Ala SerPhe 305 310 315 320 Ser Arg Lys Val Val Ser Ser Ser Lys Thr Ser Ser GlySer Gln Ala 325 330 335 Leu Val Leu Arg Ser Arg Leu Arg Leu Pro Glu MetVal Gly His Pro 340 345 350 Ala Phe Ala Val Ile Phe Gln Leu Glu Tyr ValPhe Ser Ser Pro Ala 355 360 365 Gly Val Asp Gly Asn Ala Ala Ser Val ThrSer Leu Ser Asn Leu Ala 370 375 380 Cys Met His Met Val Arg Trp Ala ValTrp Asn Pro Leu Leu Glu Ala 385 390 395 400 Asp Ser Gly Arg Val Thr LeuPro Leu Gln Gly Gly Ile Gln Pro Asn 405 410 415 Pro Ser His Cys Leu ValTyr Lys Val Pro Ser Ala Ser Met Ser Ser 420 425 430 Glu Glu Val Lys GlnVal Glu Ser Gly Thr Leu Arg Phe Gln Phe Ser 435 440 445 Leu Gly Ser 
GluGlu His Leu Asp Ala Pro Thr Glu Pro Val Ser Gly 450 455 460 Pro Lys ValGlu Arg Arg Pro Ser Arg Lys Pro Pro Thr Ser Pro Ser 465 470 475 480 SerPro Pro Ala Pro Val Pro Arg Val Leu Ala Ala Pro Gln Asn Ser 485 490 495Pro Val Gly Pro Gly Leu Ser Ile Ser Gln Leu Ala Ala Ser Pro Arg 500 505510 Ser Pro Thr Gln His Cys Leu Ala Arg Pro Thr Ser Gln Leu Pro His 515520 525 Gly Ser Gln Ala Ser Pro Ala Gln Ala Gln Glu Phe Pro Leu Glu Ala530 535 540 Gly Ile Ser His Leu Glu Ala Asp Leu Ser Gln Thr Ser Leu ValLeu 545 550 555 560 Glu Thr Ser Ile Ala Glu Gln Leu Gln Glu Leu Pro PheThr Pro Leu 565 570 575 His Ala Pro Ile Val Val Gly Thr Gln Thr Arg SerSer Ala Gly Gln 580 585 590 Pro Ser Arg Ala Ser Met Val Leu Leu Gln SerSer Gly Phe Pro Glu 595 600 605 Ile Leu Asp Ala Asn Lys Gln Pro Ala GluAla Val Ser Ala Thr Glu 610 615 620 Pro Val Thr Phe Asn Pro Gln Lys GluGlu Ser Asp Cys Leu Gln Ser 625 630 635 640 Asn Glu Met Val Leu Gln PheLeu Ala Phe Ser Arg Val Ala Gln Asp 645 650 655 Cys Arg Gly Thr Ser TrpPro Lys Thr Val Tyr Phe Thr Phe Gln Phe 660 665 670 Tyr Arg Phe Pro ProAla Thr Thr Pro Arg Leu Gln Leu Val Gln Leu 675 680 685 Asp Glu Ala GlyGln Pro Ser Ser Gly Ala Leu Thr His Ile Leu Val 690 695 700 Pro Val SerArg Asp Gly Thr Phe Asp Ala Gly Ser Pro Gly Phe Gln 705 710 715 720 LeuArg Tyr Met Val Gly Pro Gly Phe Leu Lys Pro Gly Glu Arg Arg 725 730 735Cys Phe Ala Arg Tyr Leu Ala Val Gln Thr Leu Gln Ile Asp Val Trp 740 745750 Asp Gly Asp Ser Leu Leu Leu Ile Gly Ser Ala Ala Val Gln Met Lys 755760 765 His Leu Leu Arg Gln Gly Arg Pro Ala Val Gln Ala Ser His Glu Leu770 775 780 Glu Val Val Ala Thr Glu Tyr Glu Gln Asp Asn Met Val Val SerGly 785 790 795 800 Asp Met Leu Gly Phe Gly Arg Val Lys Pro Ile Gly ValHis Ser Val 805 810 815 Val Lys Gly Arg Leu His Leu Thr Leu Ala Asn ValGly His Pro Cys 820 825 830 Glu Gln Lys Val Arg Gly Cys Ser Thr Leu ProPro Ser Arg Ser Arg 835 840 845 Val Ile Ser Asn Asp Gly Ala Ser Arg PheSer Gly Gly Ser Leu Leu 850 855 860 Thr Thr Gly Ser Ser Arg Arg Lys HisVal Val 
Gln Ala Gln Lys Leu 865 870 875 880 Ala Asp Val Asp Ser Glu LeuAla Ala Met Leu Leu Thr His Ala Arg 885 890 895 Gln Gly Lys Gly Pro GlnAsp Val Ser Arg Glu Ser Asp Ala Thr Arg 900 905 910 Arg Arg Lys Leu GluArg Met Arg Ser Val Arg Leu Gln Glu Ala Gly 915 920 925 Gly Asp Leu GlyArg Arg Gly Thr Ser Val Leu Ala Gln Gln Ser Val 930 935 940 Arg Thr GlnHis Leu Arg Asp Leu Gln Val Ile Ala Ala Tyr Arg Glu 945 950 955 960 ArgThr Lys Ala Glu Ser Ile Ala Ser Leu Leu Ser Leu Ala Ile Thr 965 970 975Thr Glu His Thr Leu His Ala Thr Leu Gly Val Ala Glu Phe Phe Glu 980 985990 Phe Val Leu Lys Asn Pro His Asn Thr Gln His Thr Val Thr Val Glu 9951000 1005 Ile Asp Asn Pro Glu Leu Ser Val Ile Val Asp Ser Gln Glu Trp1010 1015 1020 Arg Asp Phe Lys Gly Ala Ala Gly Leu His Thr Pro Val GluGlu 1025 1030 1035 Asp Met Phe His Leu Arg Gly Ser Leu Ala Pro Gln LeuTyr Leu 1040 1045 1050 Arg Pro His Glu Thr Ala His Val Pro Phe Lys PheGln Ser Phe 1055 1060 1065 Ser Ala Gly Gln Leu Ala Met Val Gln Ala SerPro Gly Leu Ser 1070 1075 1080 Asn Glu Lys Gly Met Asp Ala Val Ser ProTrp Lys Ser Ser Ala 1085 1090 1095 Val Pro Thr Lys His Ala Lys Val LeuPhe Arg Ala Ser Gly Gly 1100 1105 1110 Lys Pro Ile Ala Val Leu Cys LeuThr Val Glu Leu Gln Pro His 1115 1120 1125 Val Val Asp Gln Val Phe ArgPhe Tyr His Pro Glu Leu Ser Phe 1130 1135 1140 Leu Lys Lys Ala Ile ArgLeu Pro Pro Trp His Thr Phe Pro Gly 1145 1150 1155 Ala Pro Val Gly MetLeu Gly Glu Asp Pro Pro Val His Val Arg 1160 1165 1170 Cys Ser Asp ProAsn Val Ile Cys Glu Thr Gln Asn Val Gly Pro 1175 1180 1185 Gly Glu ProArg Asp Ile Phe Leu Lys Val Ala Ser Gly Pro Ser 1190 1195 1200 Pro GluIle Lys Asp Phe Phe Val Ile Ile Tyr Ser Asp Arg Trp 1205 1210 1215 LeuAla Thr Pro Thr Gln Thr Trp Gln Val Tyr Leu His Ser Leu 1220 1225 1230Gln Arg Val Asp Val Ser Cys Val Ala Gly Gln Leu Thr Arg Leu 1235 12401245 Ser Leu Val Leu Arg Gly Thr Gln Thr Val Arg Lys Val Arg Ala 12501255 1260 Phe Thr Ser His Pro Gln Glu Leu Lys Thr Asp Pro Lys Gly Val1265 1270 1275 Phe Val Leu Pro Pro Arg Gly 
Val Gln Asp Leu His Val GlyVal 1280 1285 1290 Arg Pro Leu Arg Ala Gly Ser Arg Phe Val His Leu AsnLeu Val 1295 1300 1305 Asp Val Asp Cys His Gln Leu Val Ala Ser Trp LeuVal Cys Leu 1310 1315 1320 Cys Cys Arg Gln Pro Leu Ile Ser Lys Ala PheGlu Ile Met Leu 1325 1330 1335 Ala Ala Gly Glu Gly Lys Gly Val Asn LysArg Ile Thr Tyr Thr 1340 1345 1350 Asn Pro Tyr Pro Ser Arg Arg Thr PheHis Leu His Ser Asp His 1355 1360 1365 Pro Glu Leu Leu Arg Phe Arg GluAsp Ser Phe Gln Val Gly Gly 1370 1375 1380 Gly Glu Thr Tyr Thr Ile GlyLeu Gln Phe Ala Pro Ser Gln Arg 1385 1390 1395 Val Gly Glu Glu Glu IleLeu Ile Tyr Ile Asn Asp His Glu Asp 1400 1405 1410 Lys Asn Glu Glu AlaPhe Cys Val Lys Val Ile Tyr Gln 1415 1420 1425 3 1366 PRT Mus musculus 3Met Gly Asp Trp His Arg Ala Phe Thr Gln Asn Thr Leu Val Pro Pro 1 5 1015 His Pro Gln Arg Ala Arg Gln Leu Gly Lys Glu Ser Thr Ala Phe Gln 20 2530 Cys Ile Leu Lys Trp Leu Asp Gly Pro Leu Ile Lys Gln Gly Ile Leu 35 4045 Asp Met Leu Ser Glu Leu Glu Cys His Leu Arg Val Thr Leu Phe Asp 50 5560 Val Thr Tyr Lys His Phe Phe Gly Arg Thr Trp Lys Thr Thr Val Lys 65 7075 80 Pro Thr Asn Gln Pro Ser Lys Gln Pro Pro Arg Ile Thr Phe Asn Glu 8590 95 Pro Leu Tyr Phe His Thr Thr Leu Ser His Pro Ser Ile Val Ala Val100 105 110 Val Glu Val Val Thr Glu Gly Arg Lys Arg Asp Gly Thr Leu GlnLeu 115 120 125 Leu Ser Cys Gly Phe Gly Ile Leu Arg Ile Phe Gly Asn LysPro Glu 130 135 140 Ser Pro Thr Ser Ala Ala Gln Asp Lys Arg Leu Arg LeuTyr His Gly 145 150 155 160 Thr Pro Arg Ala Leu Leu His Pro Leu Leu GlnAsp Pro Ile Glu Gln 165 170 175 Asn Lys Phe Met Arg Leu Met Glu Asn CysSer Leu Gln Tyr Thr Leu 180 185 190 Lys Pro His Pro Pro Leu Glu Pro AlaPhe His Leu Leu Pro Glu Asn 195 200 205 Leu Leu Val Ser Gly Phe Gln GlnIle Pro Gly Leu Leu Pro Pro His 210 215 220 Gly Asp Thr Gly Asp Ala LeuArg Lys Pro Arg Phe Gln Lys Pro Thr 225 230 235 240 Thr Trp His Leu AspAsp Leu Phe Phe Thr Leu Tyr Pro Ser Leu Glu 245 250 255 Lys Phe Glu GluGlu Leu Val Gln Leu Leu Ile Ser Asp Arg Glu Gly 260 265 270 
Val Gly LeuLeu Asp Ser Gly Thr Leu Glu Val Leu Glu Arg Arg Leu 275 280 285 His ValCys Val His Asn Gly Leu Gly Phe Val Gln Arg Pro Gln Val 290 295 300 ValVal Leu Val Pro Glu Met Asp Val Ala Leu Thr Arg Ser Ala Ser 305 310 315320 Phe Ser Arg Lys Ile Ser Ala Ser Ser Lys Asn Ser Ser Gly Asn Gln 325330 335 Ala Leu Val Leu Arg Ser His Leu Arg Leu Pro Glu Met Val Ser His340 345 350 Pro Ala Phe Ala Ile Val Phe Gln Leu Glu Tyr Val Phe Asn SerPro 355 360 365 Ser Gly Ala Asp Gly Gly Ala Ser Ser Pro Thr Ser Ile SerSer Val 370 375 380 Ala Cys Met His Met Val Arg Trp Ala Val Trp Asn ProAsp Leu Glu 385 390 395 400 Val Gly Pro Gly Lys Val Thr Leu Pro Leu GlnGly Gly Val Gln Gln 405 410 415 Asn Pro Ser Arg Cys Leu Val Tyr Lys ValPro Ser Ala Ser Met Ser 420 425 430 Ser Glu Glu Val Lys Gln Val Glu SerGly Thr Ile Gln Phe Gln Phe 435 440 445 Ser Leu Ser Ser Asp Gly Pro ThrGlu His Ala Asn Gly Pro Arg Val 450 455 460 Gly Arg Arg Ser Ser Arg LysMet Pro Ala Ser Pro Ser Gln Glu Ser 465 470 475 480 Val Leu Ser Glu ArgVal Ser His Leu Glu Ala Asp Leu Ser Gln Pro 485 490 495 Ala Ser Leu GlnGly Thr Pro Ala Val Glu His Leu Gln Glu Leu Pro 500 505 510 Phe Thr ProLeu His Ala Pro Ile Val Val Gly Ala Gln Thr Arg Ser 515 520 525 Ser ArgSer Gln Leu Ser Arg Ala Ala Met Val Leu Leu Gln Ser Ser 530 535 540 GlyPhe Pro Glu Ile Leu Asp Ala Ser Gln Gln Pro Val Glu Ala Val 545 550 555560 Asn Pro Ile Asp Pro Val Arg Phe Asn Pro Gln Lys Glu Glu Ser Asp 565570 575 Cys Leu Arg Gly Asn Glu Ile Val Leu Gln Phe Leu Ala Phe Ser Arg580 585 590 Ala Ala Gln Asp Cys Pro Gly Thr Pro Trp Pro Gln Thr Val TyrPhe 595 600 605 Thr Phe Gln Phe Tyr Arg Phe Pro Pro Glu Thr Thr Pro ArgLeu Gln 610 615 620 Leu Val Lys Leu Asp Gly Thr Gly Lys Ser Gly Ser GlySer Leu Ser 625 630 635 640 His Ile Leu Val Pro Ile Asn Lys Asp Gly SerPhe Asp Ala Gly Ser 645 650 655 Pro Gly Leu Gln Leu Arg Tyr Met Val AspPro Gly Phe Leu Lys Pro 660 665 670 Gly Glu Gln Arg Trp Phe Ala His TyrLeu Ala Ala Gln Thr Leu Gln 675 680 685 Val Asp Val Trp Asp Gly Asp 
SerLeu Leu Leu Ile Gly Ser Ala Gly 690 695 700 Val Gln Met Lys His Leu LeuArg Gln Gly Arg Pro Ala Val Gln Val 705 710 715 720 Ser His Glu Leu GluVal Val Ala Thr Glu Tyr Glu Gln Glu Met Met 725 730 735 Ala Val Ser GlyAsp Val Ala Gly Phe Gly Ser Val Lys Pro Ile Gly 740 745 750 Val His ThrVal Val Lys Gly Arg Leu His Leu Thr Leu Ala Asn Val 755 760 765 Gly HisAla Cys Glu Pro Arg Ala Arg Gly Ser Asn Leu Leu Pro Pro 770 775 780 SerArg Ser Arg Val Ile Ser Asn Asp Gly Ala Ser Phe Phe Ser Gly 785 790 795800 Gly Ser Leu Leu Ile Pro Gly Gly Pro Lys Arg Lys Arg Val Val Gln 805810 815 Ala Gln Arg Leu Ala Asp Val Asp Ser Glu Leu Ala Ala Met Leu Leu820 825 830 Thr His Thr Arg Ala Gly Gln Gly Pro Gln Ala Ala Gly Gln GluAla 835 840 845 Asp Ala Val His Lys Arg Lys Leu Glu Arg Met Arg Leu ValArg Leu 850 855 860 Gln Glu Ala Gly Gly Asp Ser Asp Ser Arg Arg Ile SerLeu Leu Ala 865 870 875 880 Gln His Ser Val Arg Ala Gln His Ser Arg AspLeu Gln Val Ile Asp 885 890 895 Ala Tyr Arg Glu Arg Thr Lys Ala Glu SerIle Ala Gly Val Leu Ser 900 905 910 Gln Ala Ile Thr Thr His His Thr LeuTyr Ala Thr Leu Gly Thr Ala 915 920 925 Glu Phe Phe Glu Phe Ala Leu LysAsn Pro His Asn Thr Gln His Thr 930 935 940 Val Ala Ile Glu Ile Asp SerPro Glu Leu Ser Ile Ile Leu Asp Ser 945 950 955 960 Gln Glu Trp Arg TyrPhe Lys Glu Ala Thr Gly Leu His Thr Pro Leu 965 970 975 Glu Glu Asp MetPhe His Leu Arg Gly Ser Leu Ala Pro Gln Leu Tyr 980 985 990 Leu Arg ProArg Glu Thr Ala His Ile Pro Leu Lys Phe Gln Ser Phe 995 1000 1005 SerVal Gly Pro Leu Ala Pro Thr Gln Ala Pro Ala Glu Val Ile 1010 1015 1020Thr Glu Lys Asp Ala Glu Ser Gly Pro Leu Trp Lys Cys Ser Ala 1025 10301035 Met Pro Thr Lys His Ala Lys Val Leu Phe Arg Val Glu Thr Gly 10401045 1050 Gln Leu Ile Ala Val Leu Cys Leu Thr Val Glu Pro Gln Pro His1055 1060 1065 Val Val Asp Gln Val Phe Arg Phe Tyr His Pro Glu Leu ThrPhe 1070 1075 1080 Leu Lys Lys Ala Ile Arg Leu Pro Pro Trp His Thr LeuPro Gly 1085 1090 1095 Ala Pro Val Gly Met Pro Gly Glu Asp Pro Pro ValHis Val Arg 1100 
1105 1110 Cys Ser Asp Pro Asn Val Ile Cys Glu Ala GlnAsn Val Gly Pro 1115 1120 1125 Gly Glu Pro Arg Asp Val Phe Leu Lys ValAla Ser Gly Pro Ser 1130 1135 1140 Pro Glu Ile Lys Asp Phe Phe Val ValIle Tyr Ala Asp Arg Trp 1145 1150 1155 Leu Ala Val Pro Val Gln Thr TrpGln Val Cys Leu His Ser Leu 1160 1165 1170 Gln Arg Val Asp Val Ser CysVal Ala Gly Gln Leu Thr Arg Leu 1175 1180 1185 Ser Leu Val Leu Arg GlyThr Gln Thr Val Arg Lys Val Arg Ala 1190 1195 1200 Phe Thr Ser His ProGln Glu Leu Lys Thr Asp Pro Ala Gly Val 1205 1210 1215 Phe Val Leu ProPro His Gly Val Gln Asp Leu His Val Gly Val 1220 1225 1230 Arg Pro ArgArg Ala Gly Ser Arg Phe Val His Leu Asn Leu Val 1235 1240 1245 Asp IleAsp Tyr His Gln Leu Val Ala Ser Trp Leu Val Cys Leu 1250 1255 1260 SerCys Arg Gln Pro Leu Ile Ser Lys Ala Phe Glu Ile Thr Met 1265 1270 1275Ala Ala Gly Asp Glu Lys Gly Thr Asn Lys Arg Ile Thr Tyr Thr 1280 12851290 Asn Pro Tyr Pro Ser Arg Arg Thr Tyr Arg Leu His Ser Asp Arg 12951300 1305 Pro Glu Leu Leu Arg Phe Lys Glu Asp Ser Phe Gln Val Ala Gly1310 1315 1320 Gly Glu Thr Tyr Thr Ile Gly Leu Arg Phe Leu Pro Ser GlySer 1325 1330 1335 Ala Gly Gln Glu Glu Ile Leu Ile Tyr Ile Asn Asp HisGlu Asp 1340 1345 1350 Lys Asn Glu Glu Thr Phe Cys Val Lys Val Leu TyrGln 1355 1360 1365 4 1196 PRT Caenorhabditis elegans 4 Met Ser Val AsnAsp Trp Tyr Ser Leu Phe Leu Ala Asn Arg Pro Val 1 5 10 15 Glu Met LysArg Asn Val Ser Arg Gly Thr Lys Ala Leu Cys Tyr Ser 20 25 30 Met Phe IleSer Asn Leu Thr Ser Pro Gln Thr Leu Tyr Phe Tyr Ser 35 40 45 Ile Ile AsnSer Arg Asp Val Leu Leu Ile Leu Glu Phe Val Glu Glu 50 55 60 Gly Ser AspGlu Ile Asn Gly Arg Thr Phe Glu Asn Pro Lys Ser Thr 65 70 75 80 Lys IleThr Ala Pro Ala Thr Ser Val Gly Trp Phe Ser Thr His Ile 85 90 95 Glu LysLys Thr Pro Val Glu Ile Ser Asn Thr Lys Ile Phe Asp Ile 100 105 110 PheGly Gly Thr Pro Lys Leu Leu Ile Phe Asp Lys Glu Thr Val Leu 115 120 125Lys Pro Val Gly Asn Val Glu Cys Thr Tyr Asn Ile Phe Glu Met Pro 130 135140 Pro Ile Phe Phe Gln Cys Leu Pro Glu Phe Cys 
Ile Val Cys Asp Lys 145150 155 160 Asp Ile Ile Pro Gly Ile Ile Lys Asp Ser Ser Asp Glu Trp TrpLeu 165 170 175 Ser Thr Pro Lys Glu Met Pro Thr Ile Pro Ala Ala Ile AspAla Ile 180 185 190 Val Ile Gln Phe Lys Asn Asn Val Pro Glu Leu Glu LysGln Ile Thr 195 200 205 His Asp Ile Glu Lys Glu Trp Ala Leu Lys Glu GlyGly Thr Leu Lys 210 215 220 Pro Lys Ala Ile Ile Met Asp Arg Lys Leu ArgIle Gly Val His Asn 225 230 235 240 Gly Tyr Thr Tyr Val Thr Glu Pro PheThr Val Asp Leu Glu Ile Ile 245 250 255 Ser Ser Asn Ala Gly Asp Thr LeuArg Ser Arg Lys Lys Pro Ile Asp 260 265 270 Phe Gly Lys Ser Ser Asn TrpGlu Glu Gln Leu Leu Phe Gln Ala Ala 275 280 285 Gly Asn Pro Arg Leu AlaLeu Arg Asn Leu Tyr Ala Asp Pro Arg Met 290 295 300 Ala Ile Ile Phe LeuLeu Glu Tyr Thr Phe His Arg Glu Asp Asn Gln 305 310 315 320 Ser Leu AsnGln Thr Ile Leu Ile Gly Trp Ala Ala Trp Thr Pro Phe 325 330 335 Ser AspGly Ala Phe Ser Gly Lys Glu Val Glu Thr Arg Val Ser Phe 340 345 350 ValGly Gly Pro Arg Pro Asn Pro Glu Gly Val Leu Cys Tyr Lys Asn 355 360 365Val Leu Asn Gln Pro Asp Ser Leu Lys Pro Leu Asn Glu Lys Leu Glu 370 375380 Ile Phe Val Asp Phe Lys Phe Tyr Glu Asn Gly Arg Ser Val His Asn 385390 395 400 Thr Pro Thr Ser Arg Arg Ala Ala Asp Ser Ala Arg Val Gln ThrGly 405 410 415 Arg Ser Gly Asp Asn Gly Gln Ser Ala Arg Ser Asn Arg LysSer Val 420 425 430 Lys Ile Glu Thr Pro Arg Ser Pro Glu Asn Ser Asn ArgPhe Pro Ala 435 440 445 Leu Val Asp Thr Gly Arg Ser Val Ser Ser Val AspGlu Leu Arg Ser 450 455 460 Ile Asn Glu Asp Leu Asn Arg Phe Ile Glu GluPro Met Glu Ile Pro 465 470 475 480 Val Gln Asp Val Val Val Ala Lys LysPro Val Glu Glu Pro Leu Pro 485 490 495 Ile Thr Ser Val Tyr Lys Ile ProPhe Asp Glu Leu Lys Pro Ile Asn 500 505 510 Phe Pro Arg Ser Ala His SerMet Phe Ala Arg Gln Asn Phe Thr Gln 515 520 525 Leu Lys Asp Arg Asn GlySer Pro Pro Asn Thr Glu Asp Val Thr Leu 530 535 540 Lys Thr Ile Ile AspMet Lys Arg Glu Gln Leu Asp Arg Leu Ile Thr 545 550 555 560 Ser His ValTyr Phe Gln Phe Ile Ala Phe Lys Gln Leu Ala Ala Pro 565 570 
575 Asp AlaArg Met Ile Lys Lys Leu Phe Phe Thr Ile Gly Phe Tyr Arg 580 585 590 PhePro Asp Ile Thr Thr Glu Ser Met Leu Leu Thr Ser Met Glu Lys 595 600 605Gly Glu Pro Thr Leu Leu Thr Arg Leu Asp Lys Asn Gly Asn Ser Asp 610 615620 Val Ile Ala Ser Pro Gly Phe Ile Ala Lys Tyr Ile Ile Glu Gly Glu 625630 635 640 Glu Ser Lys Ala Asp Phe Leu Asp Phe Met Ala Ser Gly His AlaThr 645 650 655 Ile Asp Val Trp Asp Ser Asp Ser Leu Ile His Leu Gly SerThr Ile 660 665 670 Val Pro Ile Lys Asn Leu Tyr Arg Arg Gly Arg Glu AlaVal Gln Leu 675 680 685 Phe Ile Gln Cys Pro Val Val Asp Thr Ser Leu AspThr Ser Ser Lys 690 695 700 Ala Gly Ala Phe Leu Tyr Met Arg Val Ala AsnIle Gly Phe Pro Ser 705 710 715 720 Gly Asn Thr Tyr Asp Leu Ser Ser SerSer Ser Ser Leu Thr Thr Thr 725 730 735 Arg Ser Asn Val Asn Ser Gly GlnGly Thr Val Val Arg Arg Leu Thr 740 745 750 Ser Ser Ile Arg Leu Asn GluGlu Gly Pro His Ser Tyr Arg Ile His 755 760 765 Ala Lys Pro Leu Pro GlyAsn Ser Gly Val Gly Leu Asp Arg Phe Leu 770 775 780 Thr Ala Gln Arg LeuAsp Ile Gln Gln Arg His Glu Gln Leu Phe Asn 785 790 795 800 Glu Asn SerLeu Asp Lys Ile Arg Gln Trp Asn Asp Leu Lys Glu Gly 805 810 815 Phe AsnPhe Ser Asp Asn Lys Glu Ile Ala Gln Lys Phe Ile Phe Glu 820 825 830 GluGlu Leu Ala Ala Tyr Lys Lys Leu Arg Tyr Glu Ser Lys Pro Ala 835 840 845Lys Leu Leu Glu Ala Val Phe Lys Gly Ile Thr Ser Cys His Gln Ile 850 855860 Asn Pro Ser Phe Gly Glu Lys Val Phe Phe Glu Phe Pro Leu Glu Asn 865870 875 880 Tyr Asn Ser Glu Pro Ile Asn Cys Thr Ile Glu Phe Asp Asp GluAla 885 890 895 Leu Lys Pro Val Phe Asp Ala Glu Glu Trp Lys Phe Tyr LysThr Val 900 905 910 Asn Lys Val Thr Thr Pro Ser Glu Lys Gln Met Met ArgGln Thr Thr 915 920 925 Asp Arg Ile Glu Ile Cys Leu Gln Pro Gly Asp ValLeu Phe Ile Pro 930 935 940 Phe Ile Tyr Asp Ala Phe Phe Phe Pro Asn AspAla Phe Asn Met Tyr 945 950 955 960 Ser Thr Lys Val Val Phe Arg Arg TrpAsp Thr Lys Glu Pro Leu Ala 965 970 975 Ile Leu Asp Leu His Val His ArgArg Asn Phe Leu Leu Gln His Ser 980 985 990 Val Thr Phe Ile Cys Glu 
ThrSer Gly Asn Trp Glu Lys Gln Leu Val 995 1000 1005 Leu Pro Pro Met AlaArg Asp Arg Arg Val Leu Ser Cys Arg Cys 1010 1015 1020 Ser Asp Pro SerVal Arg Leu Thr Val Arg Asn Ala Thr Leu Gln 1025 1030 1035 Gln Ile ValGly Phe Thr Thr Tyr Ser Gly Glu Thr Asn Asp Arg 1040 1045 1050 Lys ThrPhe Leu Leu Leu Met Tyr Ser Asp His Tyr Gln Thr Arg 1055 1060 1065 LeuMet Ala Thr Trp Lys Ile Thr Ile Leu Pro Phe Phe Asn Val 1070 1075 1080Asp Val Arg Ser Ile Val Gly Gln Thr Thr Arg Leu His Leu Leu 1085 10901095 Val His Arg Arg Ser Glu His Asp Gly Val Pro Asp Asp Leu Leu 11001105 1110 Lys Val Tyr Thr Ala Ser Gly Cys Met Lys Val Val Asp Ser Val1115 1120 1125 Leu Thr Glu Arg Thr Pro Thr Ala Thr Ile Asp Phe Thr ProAsn 1130 1135 1140 Phe Ile Gly Thr Lys Lys Leu Val Val Ser Val Val AsnThr Asn 1145 1150 1155 Thr Leu Lys Leu Glu Arg Gly Phe Leu Val Tyr GlyLys Ser Glu 1160 1165 1170 Ala Pro Arg Ile Thr Gln Lys Phe Val Ile GlnIle Pro Ser Ser 1175 1180 1185 Asp Glu Ala Ile Arg Lys Val Cys 1190 11955 2603 DNA Homo sapiens 5 gacgcgaggc gggttcttgg actgagtgtg cggcgcggtgcgccgccttc cgaggctcct 60 cccgcgggtg gcagcggacg gggcgcgccc ctcggccagtcctcggtcct caggcttgtg 120 gctccgttga gcaccggccg ccgggcctct gggtccgtcgagtggagact ctctgaaaag 180 cgtgggctcc gtggcctccg gcgcggccgc ggcgggtcggtctcctagat catccgggaa 240 gcccacggga ccctcaggcg ggcaggatga acgactggcacaggatcttc acccaaaacg 300 tgcttgtccc tccccaccca cagagagcgc gccagccttggaaggaatcc acggcattcc 360 agtgtgtcct caagtggctg gacggaccgg taattaggcagggcgtgctg gaggtactgt 420 cagaggttga atgccatctg cgagtgtctt tctttgatgtcacctaccgg cacttctttg 480 ggaggacgtg gaaaaccaca gtgaagccga cgaagagaccgccgtccagg atcgtcttta 540 atgagccctt gtattttcac acatccctaa accaccctcatatcgtggct gtggtggaag 600 tggtcgctga gggcaagaaa cgggatggga gcctccagacattgtcctgt gggtttggaa 660 ttcttcggat cttcagcaac cagccggact ctcctatctctgcttcccag gacaaaaggt 720 tgcggctgta ccatggcacc cccagagccc tcctgcacccgcttctccag gaccccgcag 780 agcaaaacag acacatgacc ctcattgaga actgcagcctgcagtacacg ctgaagccac 840 acccggccct ggagcctgcg 
ttccaccttc ttcctgagaaccttctggtg tctggtctgc 900 agcagatacc tggcctgctt ccagctcatg gagaatccggcgacgctctc cgaaagcctc 960 gcctccagaa gcccatcacg gggcacttgg atgacttattcttcaccctg tacccctccc 1020 tggagaagtt tgaggaagag ctgctggagc tccacgtccaggaccacttc caggagggat 1080 gtggcccact ggacggtggt gccctggaga tcctggagcggcgcctgcgt gtgggcgtgc 1140 acaatggtct gggcttcgtg cagaggccgc aggtcgttgtactggtgcct gagatggatg 1200 tggccttgac gcgctcagct agcttcagca ggaaagtggtctcctcttcc aagaccagct 1260 ccgggagcca agctctggtt ttgagaagcc gcctccgcctcccagagatg gtcggccacc 1320 ctgcatttgc ggtcatcttc cagctggagt acgtgttcagcagccctgca ggagtggacg 1380 gcaatgcagc ttcggtcacc tctctgtcca acctggcatgcatgcacatg gtccgctggg 1440 ctgtttggaa ccccttgctg gaagctgatt ctggaagggtgaccctgcct ctgcagggtg 1500 ggatccagcc caacccctcg cactgtctgg tctacaaggtaccctcagcc agcatgagct 1560 ctgaagaggt gaagcaggtg gagtcgggta cactccggttccagttctcg ctgggctcag 1620 aagaacacct ggatgcaccc acggagcctg tcagtggccccaaagtggag cggcggcctt 1680 ccaggaaacc acccacgtcc ccttcgagcc cgccagcgccagtacctcga gttctcgctg 1740 ccccgcagaa ctcacctgtg ggaccagggt tgtcaatttcccagctggcg gcctccccgc 1800 ggtccccgac tcagcactgc ttggccaggc ctacttcacagctaccccat ggctctcagg 1860 cctccccggc ccaggcacag gagttcccgt tggaggccggtatctcccac ctggaagccg 1920 acctgagcca gacctccctg gtcctggaaa catccattgccgaacagtta caggagctgc 1980 cgttcacgcc tttgcatgcc cctattgttg tgggaacccagaccaggagc tctgcagggc 2040 agccctcgag agcctccatg gtgctcctgc agtcctccggctttcccgag attctggatg 2100 ccaataaaca gccagccgag gctgtcagcg ctacagaacctgtgacgttt aaccctcaga 2160 aggaagaatc agattgtcta caaagcaacg agatggtgctacagtttctt gcctttagca 2220 gagtggccca ggactgccga ggaacatcat ggccaaagactgtgtatttc accttccagt 2280 tctaccgctt cccacccgca acgacgccac gactgcagctggtccagctg gatgaggccg 2340 gccagcccag ctctggcgcc ctgacccaca tcctcgtgcctgtgagcaga gatggcacct 2400 ttgatgctgg gtctcctggc ttccagctga ggtacatggtgggccctggg ttcctgaagc 2460 caggtgagcg gcgctgcttt gcccgctacc tggccgtgcagaccctgcag attgacgtct 2520 gggacggaga ctccctgctg ctcatcggat ctgctgccgtccagatgaag 
catctcctcc 2580 gccaaggccg gccggctgtg tag 2603 6 779 PRT Homo sapiens misc_feature (779)..(779) Xaa can be any naturally occurring amino acid 6 Met Asn Asp Trp His Arg Ile Phe Thr Gln Asn Val Leu Val Pro Pro 1 5 10 15 His Pro Gln Arg Ala Arg Gln Pro Trp Lys Glu Ser Thr Ala Phe Gln 20 25 30 Cys Val Leu Lys Trp Leu Asp Gly Pro Val Ile Arg Gln Gly Val Leu 35 40 45 Glu Val Leu Ser Glu Val Glu Cys His Leu Arg Val Ser Phe Phe Asp 50 55 60 Val Thr Tyr Arg His Phe Phe Gly Arg Thr Trp Lys Thr Thr Val Lys 65 70 75 80 Pro Thr Lys Arg Pro Pro Ser Arg Ile Val Phe Asn Glu Pro Leu Tyr 85 90 95 Phe His Thr Ser Leu Asn His Pro His Ile Val Ala Val Val Glu Val 100 105 110 Val Ala Glu Gly Lys Lys Arg Asp Gly Ser Leu Gln Thr Leu Ser Cys 115 120 125 Gly Phe Gly Ile Leu Arg Ile Phe Ser Asn Gln Pro Asp Ser Pro Ile 130 135 140 Ser Ala Ser Gln Asp Lys Arg Leu Arg Leu Tyr His Gly Thr Pro Arg 145 150 155 160 Ala Leu Leu His Pro Leu Leu Gln Asp Pro Ala Glu Gln Asn Arg His 165 170 175 Met Thr Leu Ile Glu Asn Cys Ser Leu Gln Tyr Thr Leu Lys Pro His 180 185 190 Pro Ala Leu Glu Pro Ala Phe His Leu Leu Pro Glu Asn Leu Leu Val 195 200 205 Ser Gly Leu Gln Gln Ile Pro Gly Leu Leu Pro Ala His Gly Glu Ser 210 215 220 Gly Asp Ala Leu Arg Lys Pro Arg Leu Gln Lys Pro Ile Thr Gly His 225 230 235 240 Leu Asp Asp Leu Phe Phe Thr Leu Tyr Pro Ser Leu Glu Lys Phe Glu 245 250 255 Glu Glu Leu Leu Glu Leu His Val Gln Asp His Phe Gln Glu Gly Cys 260 265 270 Gly Pro Leu Asp Gly Gly Ala Leu Glu Ile Leu Glu Arg Arg Leu Arg 275 280 285 Val Gly Val His Asn Gly Leu Gly Phe Val Gln Arg Pro Gln Val Val 290 295 300 Val Leu Val Pro Glu Met Asp Val Ala Leu Thr Arg Ser Ala Ser Phe 305 310 315 320 Ser Arg Lys Val Val Ser Ser Ser Lys Thr Ser Ser Gly Ser Gln Ala 325 330 335 Leu Val Leu Arg Ser Arg Leu Arg Leu Pro Glu Met Val Gly His Pro 340 345 350 Ala Phe Ala Val Ile Phe Gln Leu Glu Tyr Val Phe Ser Ser Pro Ala 355 360 365 Gly Val Asp Gly Asn Ala Ala Ser Val Thr Ser Leu Ser Asn Leu Ala 370 375 380 Cys Met His Met Val Arg Trp Ala Val Trp Asn Pro Leu Leu Glu
Ala 385 390 395 400 Asp Ser Gly Arg Val Thr Leu Pro Leu Gln Gly Gly Ile Gln Pro Asn 405 410 415 Pro Ser His Cys Leu Val Tyr Lys Val Pro Ser Ala Ser Met Ser Ser 420 425 430 Glu Glu Val Lys Gln Val Glu Ser Gly Thr Leu Arg Phe Gln Phe Ser 435 440 445 Leu Gly Ser Glu Glu His Leu Asp Ala Pro Thr Glu Pro Val Ser Gly 450 455 460 Pro Lys Val Glu Arg Arg Pro Ser Arg Lys Pro Pro Thr Ser Pro Ser 465 470 475 480 Ser Pro Pro Ala Pro Val Pro Arg Val Leu Ala Ala Pro Gln Asn Ser 485 490 495 Pro Val Gly Pro Gly Leu Ser Ile Ser Gln Leu Ala Ala Ser Pro Arg 500 505 510 Ser Pro Thr Gln His Cys Leu Ala Arg Pro Thr Ser Gln Leu Pro His 515 520 525 Gly Ser Gln Ala Ser Pro Ala Gln Ala Gln Glu Phe Pro Leu Glu Ala 530 535 540 Gly Ile Ser His Leu Glu Ala Asp Leu Ser Gln Thr Ser Leu Val Leu 545 550 555 560 Glu Thr Ser Ile Ala Glu Gln Leu Gln Glu Leu Pro Phe Thr Pro Leu 565 570 575 His Ala Pro Ile Val Val Gly Thr Gln Thr Arg Ser Ser Ala Gly Gln 580 585 590 Pro Ser Arg Ala Ser Met Val Leu Leu Gln Ser Ser Gly Phe Pro Glu 595 600 605 Ile Leu Asp Ala Asn Lys Gln Pro Ala Glu Ala Val Ser Ala Thr Glu 610 615 620 Pro Val Thr Phe Asn Pro Gln Lys Glu Glu Ser Asp Cys Leu Gln Ser 625 630 635 640 Asn Glu Met Val Leu Gln Phe Leu Ala Phe Ser Arg Val Ala Gln Asp 645 650 655 Cys Arg Gly Thr Ser Trp Pro Lys Thr Val Tyr Phe Thr Phe Gln Phe 660 665 670 Tyr Arg Phe Pro Pro Ala Thr Thr Pro Arg Leu Gln Leu Val Gln Leu 675 680 685 Asp Glu Ala Gly Gln Pro Ser Ser Gly Ala Leu Thr His Ile Leu Val 690 695 700 Pro Val Ser Arg Asp Gly Thr Phe Asp Ala Gly Ser Pro Gly Phe Gln 705 710 715 720 Leu Arg Tyr Met Val Gly Pro Gly Phe Leu Lys Pro Gly Glu Arg Arg 725 730 735 Cys Phe Ala Arg Tyr Leu Ala Val Gln Thr Leu Gln Ile Asp Val Trp 740 745 750 Asp Gly Asp Ser Leu Leu Leu Ile Gly Ser Ala Ala Val Gln Met Lys 755 760 765 His Leu Leu Arg Gln Gly Arg Pro Ala Val Xaa 770 775 7 4994 DNA Homo sapiens 7 gacgcgaggc gggttcttgg actgagtgtg cggcgcggtg cgccgccttc cgaggctcct 60 cccgcgggtg gcagcggacg gggcgcgccc ctcggccagt cctcggtcct caggcttgtg 120 gctccgttga gcaccggccg
ccgggcctct gggtccgtcg agtggagactctctgaaaag 180 cgtgggctcc gtggcctccg gcgcggccgc ggcgggtcgg tctcctagatcatccgggaa 240 gcccacggga ccctcaggcg ggcaggatga acgactggca caggatcttcacccaaaacg 300 tgcttgtccc tccccaccca cagagagcgc gccagccttg gaaggaatccacggcattcc 360 agtgtgtcct caagtggctg gacggaccgg taattaggca gggcgtgctggaggtactgt 420 cagaggttga atgccatctg cgagtgtctt tctttgatgt cacctaccggcacttctttg 480 ggaggacgtg gaaaaccaca gtgaagccga cgaagagacc gccgtccaggatcgtcttta 540 atgagccctt gtattttcac acatccctaa accaccctca tatcgtggctgtggtggaag 600 tggtcgctga gggcaagaaa cgggatggga gcctccagac attgtcctgtgggtttggaa 660 ttcttcggat cttcagcaac cagccggact ctcctatctc tgcttcccaggacaaaaggt 720 tgcggctgta ccatggcacc cccagagccc tcctgcaccc gcttctccaggaccccgcag 780 agcaaaacag acacatgacc ctcattgaga actgcagcct gcagtacacgctgaagccac 840 acccggccct ggagcctgcg ttccaccttc ttcctgagaa ccttctggtgtctggtctgc 900 agcagatacc tggcctgctt ccagctcatg gagaatccgg cgacgctctccgaaagcctc 960 gcctccagaa gcccatcacg gggcacttgg atgacttatt cttcaccctgtacccctccc 1020 tggagaagtt tgaggaagag ctgctggagc tccacgtcca ggaccacttccaggagggat 1080 gtggcccact ggacggtggt gccctggaga tcctggagcg gcgcctgcgtgtgggcgtgc 1140 acaatggtct gggcttcgtg cagaggccgc aggtcgttgt actggtgcctgagatggatg 1200 tggccttgac gcgctcagct agcttcagca ggaaagtggt ctcctcttccaagaccagct 1260 ccgggagcca agctctggtt ttgagaagcc gcctccgcct cccagagatggtcggccacc 1320 ctgcatttgc ggtcatcttc cagctggagt acgtgttcag cagccctgcaggagtggacg 1380 gcaatgcagc ttcggtcacc tctctgtcca acctggcatg catgcacatggtccgctggg 1440 ctgtttggaa ccccttgctg gaagctgatt ctggaagggt gaccctgcctctgcagggtg 1500 ggatccagcc caacccctcg cactgtctgg tctacaaggt accctcagccagcatgagct 1560 ctgaagaggt gaagcaggtg gagtcgggta cactccggtt ccagttctcgctgggctcag 1620 aagaacacct ggatgcaccc acggagcctg tcagtggccc caaagtggagcggcggcctt 1680 ccaggaaacc acccacgtcc ccttcgagcc cgccagcgcc agtacctcgagttctcgctg 1740 ccccgcagaa ctcacctgtg ggaccagggt tgtcaatttc ccagctggcggcctccccgc 1800 ggtccccgac tcagcactgc ttggccaggc ctacttcaca gctaccccatggctctcagg 1860 
cctccccggc ccaggcacag gagttcccgt tggaggccgg tatctcccacctggaagccg 1920 acctgagcca gacctccctg gtcctggaaa catccattgc cgaacagttacaggagctgc 1980 cgttcacgcc tttgcatgcc cctattgttg tgggaaccca gaccaggagctctgcagggc 2040 agccctcgag agcctccatg gtgctcctgc agtcctccgg ctttcccgagattctggatg 2100 ccaataaaca gccagccgag gctgtcagcg ctacagaacc tgtgacgtttaaccctcaga 2160 aggaagaatc agattgtcta caaagcaacg agatggtgct acagtttcttgcctttagca 2220 gagtggccca ggactgccga ggaacatcat ggccaaagac tgtgtatttcaccttccagt 2280 tctaccgctt cccacccgca acgacgccac gactgcagct ggtccagctggatgaggccg 2340 gccagcccag ctctggcgcc ctgacccaca tcctcgtgcc tgtgagcagagatggcacct 2400 ttgatgctgg gtctcctggc ttccagctga ggtacatggt gggccctgggttcctgaagc 2460 caggtgagcg gcgctgcttt gcccgctacc tggccgtgca gaccctgcagattgacgtct 2520 gggacagaga ctccctgctg ctcatcggat ctgctgccgt ccagatgaagcatctcctcc 2580 gccaaggccg gccggctgtg caggcctccc acgagcttga ggtcgtggcaactgaatacg 2640 agcaggacaa catggtggtg agtggagaca tgctggggtt tggccgcgtcaagcccatcg 2700 gcgtccactc ggtggtgaag ggccggctgc acctgacttt ggccaacgtgggtcacccgt 2760 gtgaacagaa agtgagaggt tgtagcacat tgccaccgtc cagatctcgggtcatctcaa 2820 acgatggagc cagccgcttc tctggaggca gcctcctcac gactggaagctcaaggcgaa 2880 aacacgtggt gcaagcacag aagctggcgg acgtggacag tgagctggctgccatgctac 2940 tgacccatgc ccggcagggc aaggggcccc aggacgtcag ccgcgagtcggatgccaccc 3000 gcaggcgtaa gctggagcgg atgaggtctg tgcgcctgca ggaggccgggggagacttgg 3060 gccggcgcgg gacgagcgtg ttggcgcagc agagcgtccg cacacagcacttgcgggacc 3120 tacaggtcat cgccgcctac cgggaacgca cgaaggccga gagcatcgccagcctgctga 3180 gcctggccat caccacggag cacacgctcc acgccacgct gggggtcgccgagttctttg 3240 agtttgtgct taagaacccc cacaacacac agcacacggt gactgtggagatcgacaacc 3300 ccgagctcag cgtcatcgtg gacagtcagg agtggaggga cttcaagggtgctgctggcc 3360 tgcacacacc ggtggaggag gacatgttcc acctgcgtgg cagcctggccccccagctct 3420 acctgcgccc ccacgagacc gcccacgtcc ccttcaagtt ccagagcttctctgcagggc 3480 agctggccat ggtgcaggcc tctcctgggt tgagcaacga gaagggcatggacgccgtgt 3540 caccttggaa gtccagcgca gtgcccacta 
aacacgccaa ggtcttgttccgagcgagtg 3600 gtggcaagcc catcgccgtg ctctgcctga ctgtggagct gcagccccacgtggtggacc 3660 aggtcttccg cttctatcac ccggagctct ccttcctgaa gaaggccatccgcctgccgc 3720 cctggcacac atttccaggt gctccggtgg gaatgcttgg tgaggaccccccagtccatg 3780 ttcgctgcag cgacccgaac gtcatctgtg agacccagaa tgtgggccccggggaaccac 3840 gggacatatt tctgaaggtg gccagtggtc caagcccgga gatcaaagacttctttgtca 3900 tcatttactc ggatcgctgg ctggcgacac ccacacagac gtggcaggtctacctccact 3960 ccctgcagcg cgtggatgtc tcctgcgtcg caggccagct gacccgcctgtcccttgtcc 4020 ttcgggggac acagacagtg aggaaagtga gagctttcac ctctcatccccaggagctga 4080 agacagaccc caaaggtgtc ttcgtgctgc cgcctcgtgg ggtgcaggacctgcatgttg 4140 gcgtgaggcc ccttagggcc ggcagccgct ttgtccatct caacctggtggacgtggatt 4200 gccaccagct ggtggcctcc tggctcgtgt gcctctgctg ccgccagccgctcatctcca 4260 aggcctttga gatcatgttg gctgcgggcg aagggaaggg tgtcaacaagaggatcacct 4320 acaccaaccc ctacccctcc cggaggacat tccacctgca cagcgaccacccggagctgc 4380 tgcggttcag agaggactcc ttccaggtcg ggggtggaga gacctacaccatcggcttgc 4440 agtttgcgcc tagtcagaga gtgggtgagg aggagatcct gatctacatcaatgaccatg 4500 aggacaaaaa cgaagaggca ttttgcgtga aggtcatcta ccagtgagggcttgagggtg 4560 acgtccttcc tgcggcaccc agctggggcc tgtctgtgcc cctcctgccctgcaggctgt 4620 cctccccgcc tctctgcagc ctttcacttc agtgcccacc tggctgacctgtgcacttgg 4680 ctgaggaagc agagaccgag cgctggtcat tttgtagtac ctgcatccagcttagctgct 4740 gctgacaccc agcaggcctg ggttccgtga gcgcgaactc cgtggtggtgggtctggctc 4800 tggtgctgcc atctacgcat gtgggaccct cgttatcgct gttgctcaaaatgtatttta 4860 tgaatcatcc taaatgagaa aattatgttt ttcttactgg attttgtacaaacataatct 4920 attatttgct atgcaatatt ttatgctggt attatatctg ttttttaaattgttgaacaa 4980 aatactaaac tttt 4994 8 1426 PRT Homo sapiens 8 Met AsnAsp Trp His Arg Ile Phe Thr Gln Asn Val Leu Val Pro Pro 1 5 10 15 HisPro Gln Arg Ala Arg Gln Pro Trp Lys Glu Ser Thr Ala Phe Gln 20 25 30 CysVal Leu Lys Trp Leu Asp Gly Pro Val Ile Arg Gln Gly Val Leu 35 40 45 GluVal Leu Ser Glu Val Glu Cys His Leu Arg Val Ser Phe Phe Asp 50 55 60 ValThr Tyr Arg 
His Phe Phe Gly Arg Thr Trp Lys Thr Thr Val Lys 65 70 75 80Pro Thr Lys Arg Pro Pro Ser Arg Ile Val Phe Asn Glu Pro Leu Tyr 85 90 95Phe His Thr Ser Leu Asn His Pro His Ile Val Ala Val Val Glu Val 100 105110 Val Ala Glu Gly Lys Lys Arg Asp Gly Ser Leu Gln Thr Leu Ser Cys 115120 125 Gly Phe Gly Ile Leu Arg Ile Phe Ser Asn Gln Pro Asp Ser Pro Ile130 135 140 Ser Ala Ser Gln Asp Lys Arg Leu Arg Leu Tyr His Gly Thr ProArg 145 150 155 160 Ala Leu Leu His Pro Leu Leu Gln Asp Pro Ala Glu GlnAsn Arg His 165 170 175 Met Thr Leu Ile Glu Asn Cys Ser Leu Gln Tyr ThrLeu Lys Pro His 180 185 190 Pro Ala Leu Glu Pro Ala Phe His Leu Leu ProGlu Asn Leu Leu Val 195 200 205 Ser Gly Leu Gln Gln Ile Pro Gly Leu LeuPro Ala His Gly Glu Ser 210 215 220 Gly Asp Ala Leu Arg Lys Pro Arg LeuGln Lys Pro Ile Thr Gly His 225 230 235 240 Leu Asp Asp Leu Phe Phe ThrLeu Tyr Pro Ser Leu Glu Lys Phe Glu 245 250 255 Glu Glu Leu Leu Glu LeuHis Val Gln Asp His Phe Gln Glu Gly Cys 260 265 270 Gly Pro Leu Asp GlyGly Ala Leu Glu Ile Leu Glu Arg Arg Leu Arg 275 280 285 Val Gly Val HisAsn Gly Leu Gly Phe Val Gln Arg Pro Gln Val Val 290 295 300 Val Leu ValPro Glu Met Asp Val Ala Leu Thr Arg Ser Ala Ser Phe 305 310 315 320 SerArg Lys Val Val Ser Ser Ser Lys Thr Ser Ser Gly Ser Gln Ala 325 330 335Leu Val Leu Arg Ser Arg Leu Arg Leu Pro Glu Met Val Gly His Pro 340 345350 Ala Phe Ala Val Ile Phe Gln Leu Glu Tyr Val Phe Ser Ser Pro Ala 355360 365 Gly Val Asp Gly Asn Ala Ala Ser Val Thr Ser Leu Ser Asn Leu Ala370 375 380 Cys Met His Met Val Arg Trp Ala Val Trp Asn Pro Leu Leu GluAla 385 390 395 400 Asp Ser Gly Arg Val Thr Leu Pro Leu Gln Gly Gly IleGln Pro Asn 405 410 415 Pro Ser His Cys Leu Val Tyr Lys Val Pro Ser AlaSer Met Ser Ser 420 425 430 Glu Glu Val Lys Gln Val Glu Ser Gly Thr LeuArg Phe Gln Phe Ser 435 440 445 Leu Gly Ser Glu Glu His Leu Asp Ala ProThr Glu Pro Val Ser Gly 450 455 460 Pro Lys Val Glu Arg Arg Pro Ser ArgLys Pro Pro Thr Ser Pro Ser 465 470 475 480 Ser Pro Pro Ala Pro Val ProArg Val Leu Ala Ala 
Pro Gln Asn Ser 485 490 495 Pro Val Gly Pro Gly LeuSer Ile Ser Gln Leu Ala Ala Ser Pro Arg 500 505 510 Ser Pro Thr Gln HisCys Leu Ala Arg Pro Thr Ser Gln Leu Pro His 515 520 525 Gly Ser Gln AlaSer Pro Ala Gln Ala Gln Glu Phe Pro Leu Glu Ala 530 535 540 Gly Ile SerHis Leu Glu Ala Asp Leu Ser Gln Thr Ser Leu Val Leu 545 550 555 560 GluThr Ser Ile Ala Glu Gln Leu Gln Glu Leu Pro Phe Thr Pro Leu 565 570 575His Ala Pro Ile Val Val Gly Thr Gln Thr Arg Ser Ser Ala Gly Gln 580 585590 Pro Ser Arg Ala Ser Met Val Leu Leu Gln Ser Ser Gly Phe Pro Glu 595600 605 Ile Leu Asp Ala Asn Lys Gln Pro Ala Glu Ala Val Ser Ala Thr Glu610 615 620 Pro Val Thr Phe Asn Pro Gln Lys Glu Glu Ser Asp Cys Leu GlnSer 625 630 635 640 Asn Glu Met Val Leu Gln Phe Leu Ala Phe Ser Arg ValAla Gln Asp 645 650 655 Cys Arg Gly Thr Ser Trp Pro Lys Thr Val Tyr PheThr Phe Gln Phe 660 665 670 Tyr Arg Phe Pro Pro Ala Thr Thr Pro Arg LeuGln Leu Val Gln Leu 675 680 685 Asp Glu Ala Gly Gln Pro Ser Ser Gly AlaLeu Thr His Ile Leu Val 690 695 700 Pro Val Ser Arg Asp Gly Thr Phe AspAla Gly Ser Pro Gly Phe Gln 705 710 715 720 Leu Arg Tyr Met Val Gly ProGly Phe Leu Lys Pro Gly Glu Arg Arg 725 730 735 Cys Phe Ala Arg Tyr LeuAla Val Gln Thr Leu Gln Ile Asp Val Trp 740 745 750 Asp Arg Asp Ser LeuLeu Leu Ile Gly Ser Ala Ala Val Gln Met Lys 755 760 765 His Leu Leu ArgGln Gly Arg Pro Ala Val Gln Ala Ser His Glu Leu 770 775 780 Glu Val ValAla Thr Glu Tyr Glu Gln Asp Asn Met Val Val Ser Gly 785 790 795 800 AspMet Leu Gly Phe Gly Arg Val Lys Pro Ile Gly Val His Ser Val 805 810 815Val Lys Gly Arg Leu His Leu Thr Leu Ala Asn Val Gly His Pro Cys 820 825830 Glu Gln Lys Val Arg Gly Cys Ser Thr Leu Pro Pro Ser Arg Ser Arg 835840 845 Val Ile Ser Asn Asp Gly Ala Ser Arg Phe Ser Gly Gly Ser Leu Leu850 855 860 Thr Thr Gly Ser Ser Arg Arg Lys His Val Val Gln Ala Gln LysLeu 865 870 875 880 Ala Asp Val Asp Ser Glu Leu Ala Ala Met Leu Leu ThrHis Ala Arg 885 890 895 Gln Gly Lys Gly Pro Gln Asp Val Ser Arg Glu SerAsp Ala Thr Arg 900 905 910 Arg 
Arg Lys Leu Glu Arg Met Arg Ser Val ArgLeu Gln Glu Ala Gly 915 920 925 Gly Asp Leu Gly Arg Arg Gly Thr Ser ValLeu Ala Gln Gln Ser Val 930 935 940 Arg Thr Gln His Leu Arg Asp Leu GlnVal Ile Ala Ala Tyr Arg Glu 945 950 955 960 Arg Thr Lys Ala Glu Ser IleAla Ser Leu Leu Ser Leu Ala Ile Thr 965 970 975 Thr Glu His Thr Leu HisAla Thr Leu Gly Val Ala Glu Phe Phe Glu 980 985 990 Phe Val Leu Lys AsnPro His Asn Thr Gln His Thr Val Thr Val Glu 995 1000 1005 Ile Asp AsnPro Glu Leu Ser Val Ile Val Asp Ser Gln Glu Trp 1010 1015 1020 Arg AspPhe Lys Gly Ala Ala Gly Leu His Thr Pro Val Glu Glu 1025 1030 1035 AspMet Phe His Leu Arg Gly Ser Leu Ala Pro Gln Leu Tyr Leu 1040 1045 1050Arg Pro His Glu Thr Ala His Val Pro Phe Lys Phe Gln Ser Phe 1055 10601065 Ser Ala Gly Gln Leu Ala Met Val Gln Ala Ser Pro Gly Leu Ser 10701075 1080 Asn Glu Lys Gly Met Asp Ala Val Ser Pro Trp Lys Ser Ser Ala1085 1090 1095 Val Pro Thr Lys His Ala Lys Val Leu Phe Arg Ala Ser GlyGly 1100 1105 1110 Lys Pro Ile Ala Val Leu Cys Leu Thr Val Glu Leu GlnPro His 1115 1120 1125 Val Val Asp Gln Val Phe Arg Phe Tyr His Pro GluLeu Ser Phe 1130 1135 1140 Leu Lys Lys Ala Ile Arg Leu Pro Pro Trp HisThr Phe Pro Gly 1145 1150 1155 Ala Pro Val Gly Met Leu Gly Glu Asp ProPro Val His Val Arg 1160 1165 1170 Cys Ser Asp Pro Asn Val Ile Cys GluThr Gln Asn Val Gly Pro 1175 1180 1185 Gly Glu Pro Arg Asp Ile Phe LeuLys Val Ala Ser Gly Pro Ser 1190 1195 1200 Pro Glu Ile Lys Asp Phe PheVal Ile Ile Tyr Ser Asp Arg Trp 1205 1210 1215 Leu Ala Thr Pro Thr GlnThr Trp Gln Val Tyr Leu His Ser Leu 1220 1225 1230 Gln Arg Val Asp ValSer Cys Val Ala Gly Gln Leu Thr Arg Leu 1235 1240 1245 Ser Leu Val LeuArg Gly Thr Gln Thr Val Arg Lys Val Arg Ala 1250 1255 1260 Phe Thr SerHis Pro Gln Glu Leu Lys Thr Asp Pro Lys Gly Val 1265 1270 1275 Phe ValLeu Pro Pro Arg Gly Val Gln Asp Leu His Val Gly Val 1280 1285 1290 ArgPro Leu Arg Ala Gly Ser Arg Phe Val His Leu Asn Leu Val 1295 1300 1305Asp Val Asp Cys His Gln Leu Val Ala Ser Trp Leu Val Cys Leu 1310 
13151320 Cys Cys Arg Gln Pro Leu Ile Ser Lys Ala Phe Glu Ile Met Leu 13251330 1335 Ala Ala Gly Glu Gly Lys Gly Val Asn Lys Arg Ile Thr Tyr Thr1340 1345 1350 Asn Pro Tyr Pro Ser Arg Arg Thr Phe His Leu His Ser AspHis 1355 1360 1365 Pro Glu Leu Leu Arg Phe Arg Glu Asp Ser Phe Gln ValGly Gly 1370 1375 1380 Gly Glu Thr Tyr Thr Ile Gly Leu Gln Phe Ala ProSer Gln Arg 1385 1390 1395 Val Gly Glu Glu Glu Ile Leu Ile Tyr Ile AsnAsp His Glu Asp 1400 1405 1410 Lys Asn Glu Glu Ala Phe Cys Val Lys ValIle Tyr Gln 1415 1420 1425 9 3629 DNA Homo sapiens 9 gacgcgaggcgggttcttgg actgagtgtg cggcgcggtg cgccgccttc cgaggctcct 60 cccgcgggtggcagcggacg gggcgcgccc ctcggccagt cctcggtcct caggcttgtg 120 gctccgttgagcaccggccg ccgggcctct gggtccgtcg agtggagact ctctgaaaag 180 cgtgggctccgtggcctccg gcgcggccgc ggcgggtcgg tctcctagat catccgggaa 240 gcccacgggaccctcaggcg ggcaggatga acgactggca caggatcttc acccaaaacg 300 tgcttgtccctccccaccca cagagagcgc gccagccttg gaaggaatcc acggcattcc 360 agtgtgtcctcaagtggctg gacggaccgg taattaggca gggcgtgctg gaggtactgt 420 cagaggttgaatgccatctg cgagtgtctt tctttgatgt cacctaccgg cacttctttg 480 ggaggacgtggaaaaccaca gtgaagccga cgaagagacc gccgtccagg atcgtcttta 540 atgagcccttgtattttcac acatccctaa accaccctca tatcgtggct gtggtggaag 600 tggtcgctgagggcaagaaa cgggatggga gcctccagac attgtcctgt gggtttggaa 660 ttcttcggatcttcagcaac cagccggact ctcctatctc tgcttcccag gacaaaaggt 720 tgcggctgtaccatggcacc cccagagccc tcctgcaccc gcttctccag gaccccgcag 780 agcaaaacagacacatgacc ctcattgaga actgcagcct gcagtacacg ctgaagccac 840 acccggccctggagcctgcg ttccaccttc ttcctgagaa ccttctggtg tctggtctgc 900 agcagatacctggcctgctt ccagctcatg gagaatccgg cgacgctctc cgaaagcctc 960 gcctccagaagcccatcacg gggcacttgg atgacttatt cttcaccctg tacccctccc 1020 tggagaagtttgaggaagag ctgctggagc tccacgtcca ggaccacttc caggagggat 1080 gtggcccactggacggtggt gccctggaga tcctggagcg gcgcctgcgt gtgggcgtgc 1140 acaatggtctgggcttcgtg cagaggccgc aggtcgttgt actggtgcct gagatggatg 1200 tggccttgacgcgctcagct agcttcagca ggaaagtggt ctcctcttcc aagaccagct 1260 
ccgggagccaagctctggtt ttgagaagcc gcctccgcct cccagagatg gtcggccacc 1320 ctgcatttgcggtcatcttc cagctggagt acgtgttcag cagccctgca ggagtggacg 1380 gcaatgcagcttcggtcacc tctctgtcca acctggcatg catgcacatg gtccgctggg 1440 ctgtttggaaccccttgctg gaagctgatt ctggaagggt gaccctgcct ctgcagggtg 1500 ggatccagcccaacccctcg cactgtctgg tctacaaggt accctcagcc agcatgagct 1560 ctgaagaggtgaagcaggtg gagtcgggta cactccggtt ccagttctcg ctgggctcag 1620 aagaacacctggatgcaccc acggagcctg tcagtggccc caaagtggag cggcggcctt 1680 ccaggaaaccacccacgtcc ccttcgagcc cgccagcgcc agtacctcga gttctcgctg 1740 ccccgcagaactcacctgtg ggaccagggt tgtcaatttc ccagctggcg gcctccccgc 1800 ggtccccgactcagcactgc ttggccaggc ctacttcaca gctaccccat ggctctcagg 1860 cctccccggcccaggcacag gagttcccgt tggaggccgg tatctcccac ctggaagccg 1920 acctgagccagacctccctg gtcctggaaa catccattgc cgaacagtta caggagctgc 1980 cgttcacgcctttgcatgcc cctattgttg tgggaaccca gaccaggagc tctgcagggc 2040 agccctcgagagcctccatg gtgctcctgc agtcctccgg ctttcccgag attctggatg 2100 ccaataaacagccagccgag gctgtcagcg ctacagaacc tgtgacgttt aaccctcaga 2160 aggaagaatcagattgtcta caaagcaacg agatggtgct acagtttctt gcctttagca 2220 gagtggcccaggactgccga ggaacatcat ggccaaagac tgtgtatttc accttccagt 2280 tctaccgcttcccacccgca acgacgccac gactgcagct ggtccagctg gatgaggccg 2340 gccagcccagctctggcgcc ctgacccaca tcctcgtgcc tgtgagcaga gatggcacct 2400 ttgatgctgggtctcctggc ttccagctga ggtacatggt gggccctggg ttcctgaagc 2460 caggtgagcggcgctgcttt gcccgctacc tggccgtgca gaccctgcag attgacgtct 2520 gggacggagactccctgctg ctcatcggat ctgctgccgt ccagatgaag catctcctcc 2580 gccaaggccggccggctgtg caggcctccc acgagcttga ggtcgtggca actgaatacg 2640 agcaggacaacatggtggtg agtggagaca tgctggggtt tggccgcgtc aagcccatcg 2700 gcgtccactcggtggtgaag ggccggctgc acctgacttt ggccaacgtg ggtcacccgt 2760 gtgaacagaaagtgagaggt tgtagcacat tgccaccgtc cagatctcgg gtcatctcaa 2820 acgatggagccagccgcttc tctggaggca gcctcctcac gactggaagc tcaaggcgaa 2880 aacacgtggtgcaagcacag aagctggcgg acgtggacag tgagctggct gccatgctac 2940 tgacccatgcccggcagggc aaggggcccc 
aggacgtcag ccgcgagtcg gatgccaccc 3000 gcaggcgtaagctggagcgg atgaggtctg tgcgcctgca ggaggccggg ggagacttgg 3060 gccggcgcgggacgagcgtg ttggcgcagc agagcgtccg cacacagcac ttgcgggacc 3120 tacaggtcatcgccgcctac cgggaacgca cgaaggccga gagcatcgcc agcctgctga 3180 gcctggccatcaccacggag cacacgctcc acgccacgct gggggtcgcc gagttctttg 3240 agtttgtgcttaagaacccc cacaacacac agcacacggt gactgtggag atcgacaacc 3300 ccgagctcagcgtcatcgtg gacagtcagg agtggaggga cttcaagggt gctgctggcc 3360 tgcacacaccggtggaggag gacatgttcc acctgcgtgg cagcctggcc ccccagctct 3420 acctgcgcccccacgagacc gcccacgtcc ccttcaagtt ccagagcttc tctgcagggc 3480 agctggccatggtgcaggcc tctcctgggt tgagcaacga gaagggcatg gacgccggtc 3540 accttggaagtccagcgcag tgcccactaa acacgccaag gtcttgttcc gagcgagtgg 3600 tggcaagcccatcgccgtgc tctgcctga 3629 10 1121 PRT Homo sapiens misc_feature(1121)..(1121) Xaa can be any naturally occurring amino acid 10 Met AsnAsp Trp His Arg Ile Phe Thr Gln Asn Val Leu Val Pro Pro 1 5 10 15 HisPro Gln Arg Ala Arg Gln Pro Trp Lys Glu Ser Thr Ala Phe Gln 20 25 30 CysVal Leu Lys Trp Leu Asp Gly Pro Val Ile Arg Gln Gly Val Leu 35 40 45 GluVal Leu Ser Glu Val Glu Cys His Leu Arg Val Ser Phe Phe Asp 50 55 60 ValThr Tyr Arg His Phe Phe Gly Arg Thr Trp Lys Thr Thr Val Lys 65 70 75 80Pro Thr Lys Arg Pro Pro Ser Arg Ile Val Phe Asn Glu Pro Leu Tyr 85 90 95Phe His Thr Ser Leu Asn His Pro His Ile Val Ala Val Val Glu Val 100 105110 Val Ala Glu Gly Lys Lys Arg Asp Gly Ser Leu Gln Thr Leu Ser Cys 115120 125 Gly Phe Gly Ile Leu Arg Ile Phe Ser Asn Gln Pro Asp Ser Pro Ile130 135 140 Ser Ala Ser Gln Asp Lys Arg Leu Arg Leu Tyr His Gly Thr ProArg 145 150 155 160 Ala Leu Leu His Pro Leu Leu Gln Asp Pro Ala Glu GlnAsn Arg His 165 170 175 Met Thr Leu Ile Glu Asn Cys Ser Leu Gln Tyr ThrLeu Lys Pro His 180 185 190 Pro Ala Leu Glu Pro Ala Phe His Leu Leu ProGlu Asn Leu Leu Val 195 200 205 Ser Gly Leu Gln Gln Ile Pro Gly Leu LeuPro Ala His Gly Glu Ser 210 215 220 Gly Asp Ala Leu Arg Lys Pro Arg LeuGln Lys Pro Ile Thr Gly His 225 230 235 240 Leu 
Asp Asp Leu Phe Phe ThrLeu Tyr Pro Ser Leu Glu Lys Phe Glu 245 250 255 Glu Glu Leu Leu Glu LeuHis Val Gln Asp His Phe Gln Glu Gly Cys 260 265 270 Gly Pro Leu Asp GlyGly Ala Leu Glu Ile Leu Glu Arg Arg Leu Arg 275 280 285 Val Gly Val HisAsn Gly Leu Gly Phe Val Gln Arg Pro Gln Val Val 290 295 300 Val Leu ValPro Glu Met Asp Val Ala Leu Thr Arg Ser Ala Ser Phe 305 310 315 320 SerArg Lys Val Val Ser Ser Ser Lys Thr Ser Ser Gly Ser Gln Ala 325 330 335Leu Val Leu Arg Ser Arg Leu Arg Leu Pro Glu Met Val Gly His Pro 340 345350 Ala Phe Ala Val Ile Phe Gln Leu Glu Tyr Val Phe Ser Ser Pro Ala 355360 365 Gly Val Asp Gly Asn Ala Ala Ser Val Thr Ser Leu Ser Asn Leu Ala370 375 380 Cys Met His Met Val Arg Trp Ala Val Trp Asn Pro Leu Leu GluAla 385 390 395 400 Asp Ser Gly Arg Val Thr Leu Pro Leu Gln Gly Gly IleGln Pro Asn 405 410 415 Pro Ser His Cys Leu Val Tyr Lys Val Pro Ser AlaSer Met Ser Ser 420 425 430 Glu Glu Val Lys Gln Val Glu Ser Gly Thr LeuArg Phe Gln Phe Ser 435 440 445 Leu Gly Ser Glu Glu His Leu Asp Ala ProThr Glu Pro Val Ser Gly 450 455 460 Pro Lys Val Glu Arg Arg Pro Ser ArgLys Pro Pro Thr Ser Pro Ser 465 470 475 480 Ser Pro Pro Ala Pro Val ProArg Val Leu Ala Ala Pro Gln Asn Ser 485 490 495 Pro Val Gly Pro Gly LeuSer Ile Ser Gln Leu Ala Ala Ser Pro Arg 500 505 510 Ser Pro Thr Gln HisCys Leu Ala Arg Pro Thr Ser Gln Leu Pro His 515 520 525 Gly Ser Gln AlaSer Pro Ala Gln Ala Gln Glu Phe Pro Leu Glu Ala 530 535 540 Gly Ile SerHis Leu Glu Ala Asp Leu Ser Gln Thr Ser Leu Val Leu 545 550 555 560 GluThr Ser Ile Ala Glu Gln Leu Gln Glu Leu Pro Phe Thr Pro Leu 565 570 575His Ala Pro Ile Val Val Gly Thr Gln Thr Arg Ser Ser Ala Gly Gln 580 585590 Pro Ser Arg Ala Ser Met Val Leu Leu Gln Ser Ser Gly Phe Pro Glu 595600 605 Ile Leu Asp Ala Asn Lys Gln Pro Ala Glu Ala Val Ser Ala Thr Glu610 615 620 Pro Val Thr Phe Asn Pro Gln Lys Glu Glu Ser Asp Cys Leu GlnSer 625 630 635 640 Asn Glu Met Val Leu Gln Phe Leu Ala Phe Ser Arg ValAla Gln Asp 645 650 655 Cys Arg Gly Thr Ser Trp Pro Lys Thr 
Val Tyr PheThr Phe Gln Phe 660 665 670 Tyr Arg Phe Pro Pro Ala Thr Thr Pro Arg LeuGln Leu Val Gln Leu 675 680 685 Asp Glu Ala Gly Gln Pro Ser Ser Gly AlaLeu Thr His Ile Leu Val 690 695 700 Pro Val Ser Arg Asp Gly Thr Phe AspAla Gly Ser Pro Gly Phe Gln 705 710 715 720 Leu Arg Tyr Met Val Gly ProGly Phe Leu Lys Pro Gly Glu Arg Arg 725 730 735 Cys Phe Ala Arg Tyr LeuAla Val Gln Thr Leu Gln Ile Asp Val Trp 740 745 750 Asp Gly Asp Ser LeuLeu Leu Ile Gly Ser Ala Ala Val Gln Met Lys 755 760 765 His Leu Leu ArgGln Gly Arg Pro Ala Val Gln Ala Ser His Glu Leu 770 775 780 Glu Val ValAla Thr Glu Tyr Glu Gln Asp Asn Met Val Val Ser Gly 785 790 795 800 AspMet Leu Gly Phe Gly Arg Val Lys Pro Ile Gly Val His Ser Val 805 810 815Val Lys Gly Arg Leu His Leu Thr Leu Ala Asn Val Gly His Pro Cys 820 825830 Glu Gln Lys Val Arg Gly Cys Ser Thr Leu Pro Pro Ser Arg Ser Arg 835840 845 Val Ile Ser Asn Asp Gly Ala Ser Arg Phe Ser Gly Gly Ser Leu Leu850 855 860 Thr Thr Gly Ser Ser Arg Arg Lys His Val Val Gln Ala Gln LysLeu 865 870 875 880 Ala Asp Val Asp Ser Glu Leu Ala Ala Met Leu Leu ThrHis Ala Arg 885 890 895 Gln Gly Lys Gly Pro Gln Asp Val Ser Arg Glu SerAsp Ala Thr Arg 900 905 910 Arg Arg Lys Leu Glu Arg Met Arg Ser Val ArgLeu Gln Glu Ala Gly 915 920 925 Gly Asp Leu Gly Arg Arg Gly Thr Ser ValLeu Ala Gln Gln Ser Val 930 935 940 Arg Thr Gln His Leu Arg Asp Leu GlnVal Ile Ala Ala Tyr Arg Glu 945 950 955 960 Arg Thr Lys Ala Glu Ser IleAla Ser Leu Leu Ser Leu Ala Ile Thr 965 970 975 Thr Glu His Thr Leu HisAla Thr Leu Gly Val Ala Glu Phe Phe Glu 980 985 990 Phe Val Leu Lys AsnPro His Asn Thr Gln His Thr Val Thr Val Glu 995 1000 1005 Ile Asp AsnPro Glu Leu Ser Val Ile Val Asp Ser Gln Glu Trp 1010 1015 1020 Arg AspPhe Lys Gly Ala Ala Gly Leu His Thr Pro Val Glu Glu 1025 1030 1035 AspMet Phe His Leu Arg Gly Ser Leu Ala Pro Gln Leu Tyr Leu 1040 1045 1050Arg Pro His Glu Thr Ala His Val Pro Phe Lys Phe Gln Ser Phe 1055 10601065 Ser Ala Gly Gln Leu Ala Met Val Gln Ala Ser Pro Gly Leu Ser 10701075 1080 
Asn Glu Lys Gly Met Asp Ala Gly His Leu Gly Ser Pro Ala Gln1085 1090 1095 Cys Pro Leu Asn Thr Pro Arg Ser Cys Ser Glu Arg Val ValAla 1100 1105 1110 Ser Pro Ser Pro Cys Ser Ala Xaa 1115 1120 11 1601 DNAHomo sapiens 11 gacgcgaggc gggttcttgg actgagtgtg cggcgcggtg cgccgccttccgaggctcct 60 cccgcgggtg gcagcggacg gggcgcgccc ctcggccagt cctcggtcctcaggcttgtg 120 gctccgttga gcaccggccg ccgggcctct gggtccgtcg agtggagactctctgaaaag 180 cgtgggctcc gtggcctccg gcgcggccgc ggcgggtcgg tctcctagatcatccgggaa 240 gcccacggga ccctcaggcg ggcaggatga acgactggca caggatcttcacccaaaacg 300 tgcttgtccc tccccaccca cagagagcgc gccagccttg gaaggaatccacggcattcc 360 agtgtgtcct caagtggctg gacggaccgg taattaggca gggcgtgctggaggtactgt 420 cagaggttga atgccatctg cgagtgtctt tctttgatgt cacctaccggcacttctttg 480 ggaggacgtg gaaaaccaca gtgaagccga cgaagagacc gccgtccaggatcgtcttta 540 atgagccctt gtattttcac acatccctaa accaccctca tatcgtggctgtggtggaag 600 tggtcgctga gggcaagaaa cgggatggga gcctccagac attgtcctgtgggtttggaa 660 ttcttcggat cttcagcaac cagccggact ctcctatctc tgcttcccaggacaaaaggt 720 tgcggctgta ccatggcacc cccagagccc tcctgcaccc gcttctccaggaccccgcag 780 agcaaaacag acacatgacc ctcattgaga actgcagcct gcagtacacgctgaagccac 840 acccggccct ggagcctgcg ttccaccttc ttcctgagaa ccttctggtgtctggtctgc 900 agcagatacc tggcctgctt ccagctcatg gagaatccgg cgacgctctccgaaagcctc 960 gcctccagaa gcccatcacg gggcacttgg atgacttatt cttcaccctgtacccctccc 1020 tggagaagtt tgaggaagag ctgctggagc tccacgtcca ggaccacttccaggagggat 1080 gtggcccact ggacggtggt gccctggaga tcctggagcg gcgcctgcgtgtgggcgtgc 1140 acaatggtct gggcttcgtg cagaggccgc aggtcgttgt actggtgcctgagatggatg 1200 tggccttgac gcgctcagct agcttcagca ggaaagtggt ctcctcttccaagaccagct 1260 ccgggagcca agctctggtt ttgagaagcc gcctccgcct cccagagatggtcggccacc 1320 ctgcatttgc ggtcatcttc cagctggagt acgtgttcag cagccctgcaggagtggacg 1380 gcaatgcagc ttcggtcacc tctctgtcca acctggcatg catgcacatggtccgctggg 1440 ctgtttggaa ccccttgctg gaagctgatt ctggaagggt gaccctgcctctgcagggtg 1500 ggatccagcc caacccctcg cactgtctgg tctacaaggt 
accctcagcc agcatgagct 1560 ctgaagaggt gaagcaggtg gagtcgggta cactccggta a 1601 12 445 PRT Homo sapiens misc_feature (445)..(445) Xaa can be any naturally occurring amino acid 12 Met Asn Asp Trp His Arg Ile Phe Thr Gln Asn Val Leu Val Pro Pro 1 5 10 15 His Pro Gln Arg Ala Arg Gln Pro Trp Lys Glu Ser Thr Ala Phe Gln 20 25 30 Cys Val Leu Lys Trp Leu Asp Gly Pro Val Ile Arg Gln Gly Val Leu 35 40 45 Glu Val Leu Ser Glu Val Glu Cys His Leu Arg Val Ser Phe Phe Asp 50 55 60 Val Thr Tyr Arg His Phe Phe Gly Arg Thr Trp Lys Thr Thr Val Lys 65 70 75 80 Pro Thr Lys Arg Pro Pro Ser Arg Ile Val Phe Asn Glu Pro Leu Tyr 85 90 95 Phe His Thr Ser Leu Asn His Pro His Ile Val Ala Val Val Glu Val 100 105 110 Val Ala Glu Gly Lys Lys Arg Asp Gly Ser Leu Gln Thr Leu Ser Cys 115 120 125 Gly Phe Gly Ile Leu Arg Ile Phe Ser Asn Gln Pro Asp Ser Pro Ile 130 135 140 Ser Ala Ser Gln Asp Lys Arg Leu Arg Leu Tyr His Gly Thr Pro Arg 145 150 155 160 Ala Leu Leu His Pro Leu Leu Gln Asp Pro Ala Glu Gln Asn Arg His 165 170 175 Met Thr Leu Ile Glu Asn Cys Ser Leu Gln Tyr Thr Leu Lys Pro His 180 185 190 Pro Ala Leu Glu Pro Ala Phe His Leu Leu Pro Glu Asn Leu Leu Val 195 200 205 Ser Gly Leu Gln Gln Ile Pro Gly Leu Leu Pro Ala His Gly Glu Ser 210 215 220 Gly Asp Ala Leu Arg Lys Pro Arg Leu Gln Lys Pro Ile Thr Gly His 225 230 235 240 Leu Asp Asp Leu Phe Phe Thr Leu Tyr Pro Ser Leu Glu Lys Phe Glu 245 250 255 Glu Glu Leu Leu Glu Leu His Val Gln Asp His Phe Gln Glu Gly Cys 260 265 270 Gly Pro Leu Asp Gly Gly Ala Leu Glu Ile Leu Glu Arg Arg Leu Arg 275 280 285 Val Gly Val His Asn Gly Leu Gly Phe Val Gln Arg Pro Gln Val Val 290 295 300 Val Leu Val Pro Glu Met Asp Val Ala Leu Thr Arg Ser Ala Ser Phe 305 310 315 320 Ser Arg Lys Val Val Ser Ser Ser Lys Thr Ser Ser Gly Ser Gln Ala 325 330 335 Leu Val Leu Arg Ser Arg Leu Arg Leu Pro Glu Met Val Gly His Pro 340 345 350 Ala Phe Ala Val Ile Phe Gln Leu Glu Tyr Val Phe Ser Ser Pro Ala 355 360 365 Gly Val Asp Gly Asn Ala Ala Ser Val Thr Ser Leu Ser Asn Leu Ala 370 375 380 Cys Met His Met Val Arg
TrpAla Val Trp Asn Pro Leu Leu Glu Ala 385 390 395 400 Asp Ser Gly Arg ValThr Leu Pro Leu Gln Gly Gly Ile Gln Pro Asn 405 410 415 Pro Ser His CysLeu Val Tyr Lys Val Pro Ser Ala Ser Met Ser Ser 420 425 430 Glu Glu ValLys Gln Val Glu Ser Gly Thr Leu Arg Xaa 435 440 445 13 2240 DNA Homosapiens 13 gacgcgaggc gggttcttgg actgagtgtg cggcgcggtg cgccgccttccgaggctcct 60 cccgcgggtg gcagcggacg gggcgcgccc ctcggccagt cctcggtcctcaggcttgtg 120 gctccgttga gcaccggccg ccgggcctct gggtccgtcg agtggagactctctgaaaag 180 cgtgggctcc gtggcctccg gcgcggccgc ggcgggtcgg tctcctagatcatccgggaa 240 gcccacggga ccctcaggcg ggcaggatga acgactggca caggatcttcacccaaaacg 300 tgcttgtccc tccccaccca cagagagcgc gccagccttg gaaggaatccacggcattcc 360 agtgtgtcct caagtggctg gacggaccgg taattaggca gggcgtgctggaggtactgt 420 cagaggttga atgccatctg cgagtgtctt tctttgatgt cacctaccggcacttctttg 480 ggaggacgtg gaaaaccaca gtgaagccga cgaagagacc gccgtccaggatcgtcttta 540 atgagccctt gtattttcac acatccctaa accaccctca tatcgtggctgtggtggaag 600 tggtcgctga gggcaagaaa cgggatggga gcctccagac attgtcctgtgggtttggaa 660 ttcttcggat cttcagcaac cagccggact ctcctatctc tgcttcccaggacaaaaggt 720 tgcggctgta ccatggcacc cccagagccc tcctgcaccc gcttctccaggaccccgcag 780 agcaaaacag acacatgacc ctcattgaga actgcagcct gcagtacacgctgaagccac 840 acccggccct ggagcctgcg ttccaccttc ttcctgagaa ccttctggtgtctggtctgc 900 agcagatacc tggcctgctt ccagctcatg gagaatccgg cgacgctctccgaaagcctc 960 gcctccagaa gcccatcacg gggcacttgg atgacttatt cttcaccctgtacccctccc 1020 tggagaagtt tgaggaagag ctgctggagc tccacgtcca ggaccacttccaggagggat 1080 gtggcccact ggacggtggt gccctggaga tcctggagcg gcgcctgcgtgtgggcgtgc 1140 acaatggtct gggcttcgtg cagaggccgc aggtcgttgt actggtgcctgagatggatg 1200 tggccttgac gcgctcagct agcttcagca ggaaagtggt ctcctcttccaagaccagct 1260 ccgggagcca agctctggtt ttgagaagcc gcctccgcct cccagagatggtcggccacc 1320 ctgcatttgc ggtcatcttc cagctggagt acgtgttcag cagccctgcaggagtggacg 1380 gcaatgcagc ttcggtcacc tctctgtcca acctggcatg catgcacatggtccgctggg 1440 ctgtttggaa ccccttgctg gaagctgatt 
ctggaagggt gaccctgcctctgcagggtg 1500 ggatccagcc caacccctcg cactgtctgg tctacaaggt accctcagccagcatgagct 1560 ctgaagaggt gaagcaggtg gagtcgggta cactccggtt ccagttctcgctgggctcag 1620 aagaacacct ggatgcaccc acggagcctg tcagtggccc caaagtggagcggcggcctt 1680 ccaggaaacc acccacgtcc ccttcgagcc cgccagcgcc agtacctcgagttctcgctg 1740 ccccgcagaa ctcacctgtg ggaccagggt tgtcaatttc ccagctggcggcctccccgc 1800 ggtccccgac tcagcactgc ttggccaggc ctacttcaca gctaccccatggctctcagg 1860 cctccccggc ccaggcacag gagttcccgt tggaggccgg tatctcccacctggaagccg 1920 acctgagcca gacctccctg gtcctggaaa catccattgc cgaacagttacaggagctgc 1980 cgttcacgcc tttgcatgcc cctattgttg tgggaaccca gaccaggagctctgcagggc 2040 agccctcgag agcctccatg gtgctcctgc agtcctccgg ctttcccgagattctggatg 2100 ccaataaaca gccagccgag gctgtcagcg ctacagaacc tgtgacgtttaaccctcaga 2160 aggaagaatc agattgtcta caaagcaacg agatggtgct acagtttcttgcctttagca 2220 gagtggccca ggactgctga 2240 14 658 PRT Homo sapiensmisc_feature (658)..(658) Xaa can be any naturally occurring amino acid14 Met Asn Asp Trp His Arg Ile Phe Thr Gln Asn Val Leu Val Pro Pro 1 510 15 His Pro Gln Arg Ala Arg Gln Pro Trp Lys Glu Ser Thr Ala Phe Gln 2025 30 Cys Val Leu Lys Trp Leu Asp Gly Pro Val Ile Arg Gln Gly Val Leu 3540 45 Glu Val Leu Ser Glu Val Glu Cys His Leu Arg Val Ser Phe Phe Asp 5055 60 Val Thr Tyr Arg His Phe Phe Gly Arg Thr Trp Lys Thr Thr Val Lys 6570 75 80 Pro Thr Lys Arg Pro Pro Ser Arg Ile Val Phe Asn Glu Pro Leu Tyr85 90 95 Phe His Thr Ser Leu Asn His Pro His Ile Val Ala Val Val Glu Val100 105 110 Val Ala Glu Gly Lys Lys Arg Asp Gly Ser Leu Gln Thr Leu SerCys 115 120 125 Gly Phe Gly Ile Leu Arg Ile Phe Ser Asn Gln Pro Asp SerPro Ile 130 135 140 Ser Ala Ser Gln Asp Lys Arg Leu Arg Leu Tyr His GlyThr Pro Arg 145 150 155 160 Ala Leu Leu His Pro Leu Leu Gln Asp Pro AlaGlu Gln Asn Arg His 165 170 175 Met Thr Leu Ile Glu Asn Cys Ser Leu GlnTyr Thr Leu Lys Pro His 180 185 190 Pro Ala Leu Glu Pro Ala Phe His LeuLeu Pro Glu Asn Leu Leu Val 195 200 205 Ser Gly Leu Gln Gln Ile Pro 
Gly Leu Leu Pro Ala His Gly Glu Ser 210 215 220 Gly Asp Ala Leu Arg Lys Pro Arg Leu Gln Lys Pro Ile Thr Gly His 225 230 235 240 Leu Asp Asp Leu Phe Phe Thr Leu Tyr Pro Ser Leu Glu Lys Phe Glu 245 250 255 Glu Glu Leu Leu Glu Leu His Val Gln Asp His Phe Gln Glu Gly Cys 260 265 270 Gly Pro Leu Asp Gly Gly Ala Leu Glu Ile Leu Glu Arg Arg Leu Arg 275 280 285 Val Gly Val His Asn Gly Leu Gly Phe Val Gln Arg Pro Gln Val Val 290 295 300 Val Leu Val Pro Glu Met Asp Val Ala Leu Thr Arg Ser Ala Ser Phe 305 310 315 320 Ser Arg Lys Val Val Ser Ser Ser Lys Thr Ser Ser Gly Ser Gln Ala 325 330 335 Leu Val Leu Arg Ser Arg Leu Arg Leu Pro Glu Met Val Gly His Pro 340 345 350 Ala Phe Ala Val Ile Phe Gln Leu Glu Tyr Val Phe Ser Ser Pro Ala 355 360 365 Gly Val Asp Gly Asn Ala Ala Ser Val Thr Ser Leu Ser Asn Leu Ala 370 375 380 Cys Met His Met Val Arg Trp Ala Val Trp Asn Pro Leu Leu Glu Ala 385 390 395 400 Asp Ser Gly Arg Val Thr Leu Pro Leu Gln Gly Gly Ile Gln Pro Asn 405 410 415 Pro Ser His Cys Leu Val Tyr Lys Val Pro Ser Ala Ser Met Ser Ser 420 425 430 Glu Glu Val Lys Gln Val Glu Ser Gly Thr Leu Arg Phe Gln Phe Ser 435 440 445 Leu Gly Ser Glu Glu His Leu Asp Ala Pro Thr Glu Pro Val Ser Gly 450 455 460 Pro Lys Val Glu Arg Arg Pro Ser Arg Lys Pro Pro Thr Ser Pro Ser 465 470 475 480 Ser Pro Pro Ala Pro Val Pro Arg Val Leu Ala Ala Pro Gln Asn Ser 485 490 495 Pro Val Gly Pro Gly Leu Ser Ile Ser Gln Leu Ala Ala Ser Pro Arg 500 505 510 Ser Pro Thr Gln His Cys Leu Ala Arg Pro Thr Ser Gln Leu Pro His 515 520 525 Gly Ser Gln Ala Ser Pro Ala Gln Ala Gln Glu Phe Pro Leu Glu Ala 530 535 540 Gly Ile Ser His Leu Glu Ala Asp Leu Ser Gln Thr Ser Leu Val Leu 545 550 555 560 Glu Thr Ser Ile Ala Glu Gln Leu Gln Glu Leu Pro Phe Thr Pro Leu 565 570 575 His Ala Pro Ile Val Val Gly Thr Gln Thr Arg Ser Ser Ala Gly Gln 580 585 590 Pro Ser Arg Ala Ser Met Val Leu Leu Gln Ser Ser Gly Phe Pro Glu 595 600 605 Ile Leu Asp Ala Asn Lys Gln Pro Ala Glu Ala Val Ser Ala Thr Glu 610 615 620 Pro Val Thr Phe Asn Pro Gln Lys Glu Glu Ser Asp Cys Leu Gln
Ser 625 630 635 640 Asn Glu Met Val Leu Gln Phe Leu Ala Phe SerArg Val Ala Gln Asp 645 650 655 Cys Xaa 15 2312 DNA Homo sapiens 15gacgcgaggc gggttcttgg actgagtgtg cggcgcggtg cgccgccttc cgaggctcct 60cccgcgggtg gcagcggacg gggcgcgccc ctcggccagt cctcggtcct caggcttgtg 120gctccgttga gcaccggccg ccgggcctct gggtccgtcg agtggagact ctctgaaaag 180cgtgggctcc gtggcctccg gcgcggccgc ggcgggtcgg tctcctagat catccgggaa 240gcccacggga ccctcaggcg ggcaggatga acgactggca caggatcttc acccaaaacg 300tgcttgtccc tccccaccca cagagagcgc gccagccttg gaaggaatcc acggcattcc 360agtgtgtcct caagtggctg gacggaccgg taattaggca gggcgtgctg gaggtactgt 420cagaggttga atgccatctg cgagtgtctt tctttgatgt cacctaccgg cacttctttg 480ggaggacgtg gaaaaccaca gtgaagccga cgaagagacc gccgtccagg atcgtcttta 540atgagccctt gtattttcac acatccctaa accaccctca tatcgtggct gtggtggaag 600tggtcgctga gggcaagaaa cgggatggga gcctccagac attgtcctgt gggtttggaa 660ttcttcggat cttcagcaac cagccggact ctcctatctc tgcttcccag gacaaaaggt 720tgcggctgta ccatggcacc cccagagccc tcctgcaccc gcttctccag gaccccgcag 780agcaaaacag acacatgacc ctcattgaga actgcagcct gcagtacacg ctgaagccac 840acccggccct ggagcctgcg ttccaccttc ttcctgagaa ccttctggtg tctggtctgc 900agcagatacc tggcctgctt ccagctcatg gagaatccgg cgacgctctc cgaaagcctc 960gcctccagaa gcccatcacg gggcacttgg atgacttatt cttcaccctg tacccctccc 1020tggagaagtt tgaggaagag ctgctggagc tccacgtcca ggaccacttc caggagggat 1080gtggcccact ggacggtggt gccctggaga tcctggagcg gcgcctgcgt gtgggcgtgc 1140acaatggtct gggcttcgtg cagaggccgc aggtcgttgt actggtgcct gagatggatg 1200tggccttgac gcgctcagct agcttcagca ggaaagtggt ctcctcttcc aagaccagct 1260ccgggagcca agctctggtt ttgagaagcc gcctccgcct cccagagatg gtcggccacc 1320ctgcatttgc ggtcatcttc cagctggagt acgtgttcag cagccctgca ggagtggacg 1380gcaatgcagc ttcggtcacc tctctgtcca acctggcatg catgcacatg gtccgctggg 1440ctgtttggaa ccccttgctg gaagctgatt ctggaagggt gaccctgcct ctgcagggtg 1500ggatccagcc caacccctcg cactgtctgg tctacaaggt accctcagcc agcatgagct 1560ctgaagaggt gaagcaggtg gagtcgggta cactccggtt ccagttctcg ctgggctcag 
1620aagaacacct ggatgcaccc acggagcctg tcagtggccc caaagtggag cggcggcctt 1680ccaggaaacc acccacgtcc ccttcgagcc cgccagcgcc agtacctcga gttctcgctg 1740ccccgcagaa ctcacctgtg ggaccagggt tgtcaatttc ccagctggcg gcctccccgc 1800ggtccccgac tcagcactgc ttggccaggc ctacttcaca gctaccccat ggctctcagg 1860cctccccggc ccaggcacag gagttcccgt tggaggccgg tatctcccac ctggaagccg 1920acctgagcca gacctccctg gtcctggaaa catccattgc cgaacagtta caggagctgc 1980cgttcacgcc tttgcatgcc cctattgttg tgggaaccca gaccaggagc tctgcagggc 2040agccctcgag agcctccatg gtgctcctgc agtcctccgg ctttcccgag attctggatg 2100ccaataaaca gccagccgag gctgtcagcg ctacagaacc tgtgacgttt aaccctcaga 2160aggaagaatc agattgtcta caaagcaacg agatggtgct acagtttctt gcctttagca 2220gagtggccca ggactgccga ggaacatcat ggccaaagac tgtgtatttc accttccagt 2280tctaccgctt cccacccgca acgacgccat ga 2312 16 682 PRT Homo sapiensmisc_feature (682)..(682) Xaa can be any naturally occurring amino acid16 Met Asn Asp Trp His Arg Ile Phe Thr Gln Asn Val Leu Val Pro Pro 1 510 15 His Pro Gln Arg Ala Arg Gln Pro Trp Lys Glu Ser Thr Ala Phe Gln 2025 30 Cys Val Leu Lys Trp Leu Asp Gly Pro Val Ile Arg Gln Gly Val Leu 3540 45 Glu Val Leu Ser Glu Val Glu Cys His Leu Arg Val Ser Phe Phe Asp 5055 60 Val Thr Tyr Arg His Phe Phe Gly Arg Thr Trp Lys Thr Thr Val Lys 6570 75 80 Pro Thr Lys Arg Pro Pro Ser Arg Ile Val Phe Asn Glu Pro Leu Tyr85 90 95 Phe His Thr Ser Leu Asn His Pro His Ile Val Ala Val Val Glu Val100 105 110 Val Ala Glu Gly Lys Lys Arg Asp Gly Ser Leu Gln Thr Leu SerCys 115 120 125 Gly Phe Gly Ile Leu Arg Ile Phe Ser Asn Gln Pro Asp SerPro Ile 130 135 140 Ser Ala Ser Gln Asp Lys Arg Leu Arg Leu Tyr His GlyThr Pro Arg 145 150 155 160 Ala Leu Leu His Pro Leu Leu Gln Asp Pro AlaGlu Gln Asn Arg His 165 170 175 Met Thr Leu Ile Glu Asn Cys Ser Leu GlnTyr Thr Leu Lys Pro His 180 185 190 Pro Ala Leu Glu Pro Ala Phe His LeuLeu Pro Glu Asn Leu Leu Val 195 200 205 Ser Gly Leu Gln Gln Ile Pro GlyLeu Leu Pro Ala His Gly Glu Ser 210 215 220 Gly Asp Ala Leu Arg Lys ProArg Leu Gln Lys Pro 
Ile Thr Gly His 225 230 235 240 Leu Asp Asp Leu Phe Phe Thr Leu Tyr Pro Ser Leu Glu Lys Phe Glu 245 250 255 Glu Glu Leu Leu Glu Leu His Val Gln Asp His Phe Gln Glu Gly Cys 260 265 270 Gly Pro Leu Asp Gly Gly Ala Leu Glu Ile Leu Glu Arg Arg Leu Arg 275 280 285 Val Gly Val His Asn Gly Leu Gly Phe Val Gln Arg Pro Gln Val Val 290 295 300 Val Leu Val Pro Glu Met Asp Val Ala Leu Thr Arg Ser Ala Ser Phe 305 310 315 320 Ser Arg Lys Val Val Ser Ser Ser Lys Thr Ser Ser Gly Ser Gln Ala 325 330 335 Leu Val Leu Arg Ser Arg Leu Arg Leu Pro Glu Met Val Gly His Pro 340 345 350 Ala Phe Ala Val Ile Phe Gln Leu Glu Tyr Val Phe Ser Ser Pro Ala 355 360 365 Gly Val Asp Gly Asn Ala Ala Ser Val Thr Ser Leu Ser Asn Leu Ala 370 375 380 Cys Met His Met Val Arg Trp Ala Val Trp Asn Pro Leu Leu Glu Ala 385 390 395 400 Asp Ser Gly Arg Val Thr Leu Pro Leu Gln Gly Gly Ile Gln Pro Asn 405 410 415 Pro Ser His Cys Leu Val Tyr Lys Val Pro Ser Ala Ser Met Ser Ser 420 425 430 Glu Glu Val Lys Gln Val Glu Ser Gly Thr Leu Arg Phe Gln Phe Ser 435 440 445 Leu Gly Ser Glu Glu His Leu Asp Ala Pro Thr Glu Pro Val Ser Gly 450 455 460 Pro Lys Val Glu Arg Arg Pro Ser Arg Lys Pro Pro Thr Ser Pro Ser 465 470 475 480 Ser Pro Pro Ala Pro Val Pro Arg Val Leu Ala Ala Pro Gln Asn Ser 485 490 495 Pro Val Gly Pro Gly Leu Ser Ile Ser Gln Leu Ala Ala Ser Pro Arg 500 505 510 Ser Pro Thr Gln His Cys Leu Ala Arg Pro Thr Ser Gln Leu Pro His 515 520 525 Gly Ser Gln Ala Ser Pro Ala Gln Ala Gln Glu Phe Pro Leu Glu Ala 530 535 540 Gly Ile Ser His Leu Glu Ala Asp Leu Ser Gln Thr Ser Leu Val Leu 545 550 555 560 Glu Thr Ser Ile Ala Glu Gln Leu Gln Glu Leu Pro Phe Thr Pro Leu 565 570 575 His Ala Pro Ile Val Val Gly Thr Gln Thr Arg Ser Ser Ala Gly Gln 580 585 590 Pro Ser Arg Ala Ser Met Val Leu Leu Gln Ser Ser Gly Phe Pro Glu 595 600 605 Ile Leu Asp Ala Asn Lys Gln Pro Ala Glu Ala Val Ser Ala Thr Glu 610 615 620 Pro Val Thr Phe Asn Pro Gln Lys Glu Glu Ser Asp Cys Leu Gln Ser 625 630 635 640 Asn Glu Met Val Leu Gln Phe Leu Ala Phe Ser Arg Val Ala Gln Asp 645 650 655
Cys Arg Gly Thr Ser Trp Pro Lys Thr ValTyr Phe Thr Phe Gln Phe 660 665 670 Tyr Arg Phe Pro Pro Ala Thr Thr ProXaa 675 680 17 4994 DNA Homo sapiens 17 gacgcgaggc gggttcttgg actgagtgtgcggcgcggtg cgccgccttc cgaggctcct 60 cccgcgggtg gcagcggacg gggcgcgcccctcggccagt cctcggtcct caggcttgtg 120 gctccgttga gcaccggccg ccgggcctctgggtccgtcg agtggagact ctctgaaaag 180 cgtgggctcc gtggcctccg gcgcggccgcggcgggtcgg tctcctagat catccgggaa 240 gcccacggga ccctcaggcg ggcaggatgaacgactggca caggatcttc acccaaaacg 300 tgcttgtccc tccccaccca cagagagcgcgccagccttg gaaggaatcc acggcattcc 360 agtgtgtcct caagtggctg gacggaccggtaattaggca gggcgtgctg gaggtactgt 420 cagaggttga atgccatctg cgagtgtctttctttgatgt cacctaccgg cacttctttg 480 ggaggacgtg gaaaaccaca gtgaagccgacgaagagacc gccgtccagg atcgtcttta 540 atgagccctt gtattttcac acatccctaaaccaccctca tatcgtggct gtggtggaag 600 tggtcgctga gggcaagaaa cgggatgggagcctccagac attgtcctgt gggtttggaa 660 ttcttcggat cttcagcaac cagccggactctcctatctc tgcttcccag gacaaaaggt 720 tgcggctgta ccatggcacc cccagagccctcctgcaccc gcttctccag gaccccgcag 780 agcaaaacag acacatgacc ctcattgagaactgcagcct gcagtacacg ctgaagccac 840 acccggccct ggagcctgcg ttccaccttcttcctgagaa ccttctggtg tctggtctgc 900 agcagatacc tggcctgctt ccagctcatggagaatccgg cgacgctctc cgaaagcctc 960 gcctccagaa gcccatcacg gggcacttggatgacttatt cttcaccctg tacccctccc 1020 tggagaagtt tgaggaagag ctgctggagctccacgtcca ggaccacttc caggagggat 1080 gtggcccact ggacggtggt gccctggagatcctggagcg gcgcctgcgt gtgggcgtgc 1140 acaatggtct gggcttcgtg cagaggccgcaggtcgttgt actggtgcct gagatggatg 1200 tggccttgac gcgctcagct agcttcagcaggaaagtggt ctcctcttcc aagaccagct 1260 ccgggagcca agctctggtt ttgagaagccgcctccgcct cccagagatg gtcggccacc 1320 ctgcatttgc ggtcatcttc cagctggagtacgtgttcag cagccctgca ggagtggacg 1380 gcaatgcagc ttcggtcacc tctctgtccaacctggcatg catgcacatg gtccgctggg 1440 ctgtttggaa ccccttgctg gaagctgattctggaagggt gaccctgcct ctgcagggtg 1500 ggatccagcc caacccctcg cactgtctggtctacaaggt accctcagcc agcatgagct 1560 ctgaagaggt gaagcaggtg gagtcgggtacactccggtt 
ccagttctcg ctgggctcag 1620 aagaacacct ggatgcaccc acggagcctgtcagtggccc caaagtggag cggcggcctt 1680 ccaggaaacc acccacgtcc ccttcgagcccgccagcgcc agtacctcga gttctcgctg 1740 ccccgcagaa ctcacctgtg ggaccagggttgtcaatttc ccagctggcg gcctccccgc 1800 ggtccccgac tcagcactgc ttggccaggcctacttcaca gctaccccat ggctctcagg 1860 cctccccggc ccaggcacag gagttcccgttggaggccgg tatctcccac ctggaagccg 1920 acctgagcca gacctccctg gtcctggaaacatccattgc cgaacagtta caggagctgc 1980 cgttcacgcc tttgcatgcc cctattgttgtgggaaccca gaccaggagc tctgcagggc 2040 agccctcgag agcctccatg gtgctcctgcagtcctccgg ctttcccgag attctggatg 2100 ccaataaaca gccagccgag gctgtcagcgctacagaacc tgtgacgttt aaccctcaga 2160 aggaagaatc agattgtcta caaagcaacgagatggtgct acagtttctt gcctttagca 2220 gagtggccca ggactgccga ggaacatcatggccaaagac tgtgtatttc accttccagt 2280 tctaccgctt cccacccgca acgacgccacgactgcagct ggtccagctg gatgaggccg 2340 gccagcccag ctctggcgcc ctgacccacatcctcgtgcc tgtgagcaga gatggcacct 2400 ttgatgctgg gtctcctggc ttccagctgaggtacatggt gggccctggg ttcctgaagc 2460 caggtgagcg gcgctgcttt gcccgctacctggccgtgca gaccctgcag attgacgtct 2520 gggacggaga ctccctgctg ctcatcggatctgctgccgt ccagatgaag catctcctcc 2580 gccaaggccg gccggctgtg caggcctcccacgagcttga ggtcgtggca actgaatacg 2640 agcaggacaa catggtggtg agtggagacatgctggggtt tggccgcgtc aagcccatcg 2700 gcgtccactc ggtggtgaag ggccggctgcacctgacttt ggccaacgtg ggtcacccgt 2760 gtgaacagaa agtgagaggt tgtagcacattgccaccgtc cagatcttgg gtcatctcaa 2820 acgatggagc cagccgcttc tctggaggcagcctcctcac gactggaagc tcaaggcgaa 2880 aacacgtggt gcaagcacag aagctggcggacgtggacag tgagctggct gccatgctac 2940 tgacccatgc ccggcagggc aaggggccccaggacgtcag ccgcgagtcg gatgccaccc 3000 gcaggcgtaa gctggagcgg atgaggtctgtgcgcctgca ggaggccggg ggagacttgg 3060 gccggcgcgg gacgagcgtg ttggcgcagcagagcgtccg cacacagcac ttgcgggacc 3120 tacaggtcat cgccgcctac cgggaacgcacgaaggccga gagcatcgcc agcctgctga 3180 gcctggccat caccacggag cacacgctccacgccacgct gggggtcgcc gagttctttg 3240 agtttgtgct taagaacccc cacaacacacagcacacggt gactgtggag atcgacaacc 3300 ccgagctcag 
cgtcatcgtg gacagtcaggagtggaggga cttcaagggt gctgctggcc 3360 tgcacacacc ggtggaggag gacatgttccacctgcgtgg cagcctggcc ccccagctct 3420 acctgcgccc ccacgagacc gcccacgtccccttcaagtt ccagagcttc tctgcagggc 3480 agctggccat ggtgcaggcc tctcctgggttgagcaacga gaagggcatg gacgccgtgt 3540 caccttggaa gtccagcgca gtgcccactaaacacgccaa ggtcttgttc cgagcgagtg 3600 gtggcaagcc catcgccgtg ctctgcctgactgtggagct gcagccccac gtggtggacc 3660 aggtcttccg cttctatcac ccggagctctccttcctgaa gaaggccatc cgcctgccgc 3720 cctggcacac atttccaggt gctccggtgggaatgcttgg tgaggacccc ccagtccatg 3780 ttcgctgcag cgacccgaac gtcatctgtgagacccagaa tgtgggcccc ggggaaccac 3840 gggacatatt tctgaaggtg gccagtggtccaagcccgga gatcaaagac ttctttgtca 3900 tcatttactc ggatcgctgg ctggcgacacccacacagac gtggcaggtc tacctccact 3960 ccctgcagcg cgtggatgtc tcctgcgtcgcaggccagct gacccgcctg tcccttgtcc 4020 ttcgggggac acagacagtg aggaaagtgagagctttcac ctctcatccc caggagctga 4080 agacagaccc caaaggtgtc ttcgtgctgccgcctcgtgg ggtgcaggac ctgcatgttg 4140 gcgtgaggcc ccttagggcc ggcagccgctttgtccatct caacctggtg gacgtggatt 4200 gccaccagct ggtggcctcc tggctcgtgtgcctctgctg ccgccagccg ctcatctcca 4260 aggcctttga gatcatgttg gctgcgggcgaagggaaggg tgtcaacaag aggatcacct 4320 acaccaaccc ctacccctcc cggaggacattccacctgca cagcgaccac ccggagctgc 4380 tgcggttcag agaggactcc ttccaggtcgggggtggaga gacctacacc atcggcttgc 4440 agtttgcgcc tagtcagaga gtgggtgaggaggagatcct gatctacatc aatgaccatg 4500 aggacaaaaa cgaagaggca ttttgcgtgaaggtcatcta ccagtgaggg cttgagggtg 4560 acgtccttcc tgcggcaccc agctggggcctgtctgtgcc cctcctgccc tgcaggctgt 4620 cctccccgcc tctctgcagc ctttcacttcagtgcccacc tggctgacct gtgcacttgg 4680 ctgaggaagc agagaccgag cgctggtcattttgtagtac ctgcatccag cttagctgct 4740 gctgacaccc agcaggcctg ggttccgtgagcgcgaactc cgtggtggtg ggtctggctc 4800 tggtgctgcc atctacgcat gtgggaccctcgttatcgct gttgctcaaa atgtatttta 4860 tgaatcatcc taaatgagaa aattatgtttttcttactgg attttgtaca aacataatct 4920 attatttgct atgcaatatt ttatgctggtattatatctg ttttttaaat tgttgaacaa 4980 aatactaaac tttt 4994 18 1426 PRTHomo sapiens 18 
Met Asn Asp Trp His Arg Ile Phe Thr Gln Asn Val Leu Val Pro Pro 1 5 10 15 His Pro Gln Arg Ala Arg Gln Pro Trp Lys Glu Ser Thr Ala Phe Gln 20 25 30 Cys Val Leu Lys Trp Leu Asp Gly Pro Val Ile Arg Gln Gly Val Leu 35 40 45 Glu Val Leu Ser Glu Val Glu Cys His Leu Arg Val Ser Phe Phe Asp 50 55 60 Val Thr Tyr Arg His Phe Phe Gly Arg Thr Trp Lys Thr Thr Val Lys 65 70 75 80 Pro Thr Lys Arg Pro Pro Ser Arg Ile Val Phe Asn Glu Pro Leu Tyr 85 90 95 Phe His Thr Ser Leu Asn His Pro His Ile Val Ala Val Val Glu Val 100 105 110 Val Ala Glu Gly Lys Lys Arg Asp Gly Ser Leu Gln Thr Leu Ser Cys 115 120 125 Gly Phe Gly Ile Leu Arg Ile Phe Ser Asn Gln Pro Asp Ser Pro Ile 130 135 140 Ser Ala Ser Gln Asp Lys Arg Leu Arg Leu Tyr His Gly Thr Pro Arg 145 150 155 160 Ala Leu Leu His Pro Leu Leu Gln Asp Pro Ala Glu Gln Asn Arg His 165 170 175 Met Thr Leu Ile Glu Asn Cys Ser Leu Gln Tyr Thr Leu Lys Pro His 180 185 190 Pro Ala Leu Glu Pro Ala Phe His Leu Leu Pro Glu Asn Leu Leu Val 195 200 205 Ser Gly Leu Gln Gln Ile Pro Gly Leu Leu Pro Ala His Gly Glu Ser 210 215 220 Gly Asp Ala Leu Arg Lys Pro Arg Leu Gln Lys Pro Ile Thr Gly His 225 230 235 240 Leu Asp Asp Leu Phe Phe Thr Leu Tyr Pro Ser Leu Glu Lys Phe Glu 245 250 255 Glu Glu Leu Leu Glu Leu His Val Gln Asp His Phe Gln Glu Gly Cys 260 265 270 Gly Pro Leu Asp Gly Gly Ala Leu Glu Ile Leu Glu Arg Arg Leu Arg 275 280 285 Val Gly Val His Asn Gly Leu Gly Phe Val Gln Arg Pro Gln Val Val 290 295 300 Val Leu Val Pro Glu Met Asp Val Ala Leu Thr Arg Ser Ala Ser Phe 305 310 315 320 Ser Arg Lys Val Val Ser Ser Ser Lys Thr Ser Ser Gly Ser Gln Ala 325 330 335 Leu Val Leu Arg Ser Arg Leu Arg Leu Pro Glu Met Val Gly His Pro 340 345 350 Ala Phe Ala Val Ile Phe Gln Leu Glu Tyr Val Phe Ser Ser Pro Ala 355 360 365 Gly Val Asp Gly Asn Ala Ala Ser Val Thr Ser Leu Ser Asn Leu Ala 370 375 380 Cys Met His Met Val Arg Trp Ala Val Trp Asn Pro Leu Leu Glu Ala 385 390 395 400 Asp Ser Gly Arg Val Thr Leu Pro Leu Gln Gly Gly Ile Gln Pro Asn 405 410 415 Pro Ser His Cys Leu Val Tyr Lys Val Pro Ser Ala
Ser Met Ser Ser 420 425 430 Glu Glu Val Lys GlnVal Glu Ser Gly Thr Leu Arg Phe Gln Phe Ser 435 440 445 Leu Gly Ser GluGlu His Leu Asp Ala Pro Thr Glu Pro Val Ser Gly 450 455 460 Pro Lys ValGlu Arg Arg Pro Ser Arg Lys Pro Pro Thr Ser Pro Ser 465 470 475 480 SerPro Pro Ala Pro Val Pro Arg Val Leu Ala Ala Pro Gln Asn Ser 485 490 495Pro Val Gly Pro Gly Leu Ser Ile Ser Gln Leu Ala Ala Ser Pro Arg 500 505510 Ser Pro Thr Gln His Cys Leu Ala Arg Pro Thr Ser Gln Leu Pro His 515520 525 Gly Ser Gln Ala Ser Pro Ala Gln Ala Gln Glu Phe Pro Leu Glu Ala530 535 540 Gly Ile Ser His Leu Glu Ala Asp Leu Ser Gln Thr Ser Leu ValLeu 545 550 555 560 Glu Thr Ser Ile Ala Glu Gln Leu Gln Glu Leu Pro PheThr Pro Leu 565 570 575 His Ala Pro Ile Val Val Gly Thr Gln Thr Arg SerSer Ala Gly Gln 580 585 590 Pro Ser Arg Ala Ser Met Val Leu Leu Gln SerSer Gly Phe Pro Glu 595 600 605 Ile Leu Asp Ala Asn Lys Gln Pro Ala GluAla Val Ser Ala Thr Glu 610 615 620 Pro Val Thr Phe Asn Pro Gln Lys GluGlu Ser Asp Cys Leu Gln Ser 625 630 635 640 Asn Glu Met Val Leu Gln PheLeu Ala Phe Ser Arg Val Ala Gln Asp 645 650 655 Cys Arg Gly Thr Ser TrpPro Lys Thr Val Tyr Phe Thr Phe Gln Phe 660 665 670 Tyr Arg Phe Pro ProAla Thr Thr Pro Arg Leu Gln Leu Val Gln Leu 675 680 685 Asp Glu Ala GlyGln Pro Ser Ser Gly Ala Leu Thr His Ile Leu Val 690 695 700 Pro Val SerArg Asp Gly Thr Phe Asp Ala Gly Ser Pro Gly Phe Gln 705 710 715 720 LeuArg Tyr Met Val Gly Pro Gly Phe Leu Lys Pro Gly Glu Arg Arg 725 730 735Cys Phe Ala Arg Tyr Leu Ala Val Gln Thr Leu Gln Ile Asp Val Trp 740 745750 Asp Gly Asp Ser Leu Leu Leu Ile Gly Ser Ala Ala Val Gln Met Lys 755760 765 His Leu Leu Arg Gln Gly Arg Pro Ala Val Gln Ala Ser His Glu Leu770 775 780 Glu Val Val Ala Thr Glu Tyr Glu Gln Asp Asn Met Val Val SerGly 785 790 795 800 Asp Met Leu Gly Phe Gly Arg Val Lys Pro Ile Gly ValHis Ser Val 805 810 815 Val Lys Gly Arg Leu His Leu Thr Leu Ala Asn ValGly His Pro Cys 820 825 830 Glu Gln Lys Val Arg Gly Cys Ser Thr Leu ProPro Ser Arg Ser Trp 835 840 845 Val 
Ile Ser Asn Asp Gly Ala Ser Arg PheSer Gly Gly Ser Leu Leu 850 855 860 Thr Thr Gly Ser Ser Arg Arg Lys HisVal Val Gln Ala Gln Lys Leu 865 870 875 880 Ala Asp Val Asp Ser Glu LeuAla Ala Met Leu Leu Thr His Ala Arg 885 890 895 Gln Gly Lys Gly Pro GlnAsp Val Ser Arg Glu Ser Asp Ala Thr Arg 900 905 910 Arg Arg Lys Leu GluArg Met Arg Ser Val Arg Leu Gln Glu Ala Gly 915 920 925 Gly Asp Leu GlyArg Arg Gly Thr Ser Val Leu Ala Gln Gln Ser Val 930 935 940 Arg Thr GlnHis Leu Arg Asp Leu Gln Val Ile Ala Ala Tyr Arg Glu 945 950 955 960 ArgThr Lys Ala Glu Ser Ile Ala Ser Leu Leu Ser Leu Ala Ile Thr 965 970 975Thr Glu His Thr Leu His Ala Thr Leu Gly Val Ala Glu Phe Phe Glu 980 985990 Phe Val Leu Lys Asn Pro His Asn Thr Gln His Thr Val Thr Val Glu 9951000 1005 Ile Asp Asn Pro Glu Leu Ser Val Ile Val Asp Ser Gln Glu Trp1010 1015 1020 Arg Asp Phe Lys Gly Ala Ala Gly Leu His Thr Pro Val GluGlu 1025 1030 1035 Asp Met Phe His Leu Arg Gly Ser Leu Ala Pro Gln LeuTyr Leu 1040 1045 1050 Arg Pro His Glu Thr Ala His Val Pro Phe Lys PheGln Ser Phe 1055 1060 1065 Ser Ala Gly Gln Leu Ala Met Val Gln Ala SerPro Gly Leu Ser 1070 1075 1080 Asn Glu Lys Gly Met Asp Ala Val Ser ProTrp Lys Ser Ser Ala 1085 1090 1095 Val Pro Thr Lys His Ala Lys Val LeuPhe Arg Ala Ser Gly Gly 1100 1105 1110 Lys Pro Ile Ala Val Leu Cys LeuThr Val Glu Leu Gln Pro His 1115 1120 1125 Val Val Asp Gln Val Phe ArgPhe Tyr His Pro Glu Leu Ser Phe 1130 1135 1140 Leu Lys Lys Ala Ile ArgLeu Pro Pro Trp His Thr Phe Pro Gly 1145 1150 1155 Ala Pro Val Gly MetLeu Gly Glu Asp Pro Pro Val His Val Arg 1160 1165 1170 Cys Ser Asp ProAsn Val Ile Cys Glu Thr Gln Asn Val Gly Pro 1175 1180 1185 Gly Glu ProArg Asp Ile Phe Leu Lys Val Ala Ser Gly Pro Ser 1190 1195 1200 Pro GluIle Lys Asp Phe Phe Val Ile Ile Tyr Ser Asp Arg Trp 1205 1210 1215 LeuAla Thr Pro Thr Gln Thr Trp Gln Val Tyr Leu His Ser Leu 1220 1225 1230Gln Arg Val Asp Val Ser Cys Val Ala Gly Gln Leu Thr Arg Leu 1235 12401245 Ser Leu Val Leu Arg Gly Thr Gln Thr Val Arg Lys Val Arg Ala 
12501255 1260 Phe Thr Ser His Pro Gln Glu Leu Lys Thr Asp Pro Lys Gly Val1265 1270 1275 Phe Val Leu Pro Pro Arg Gly Val Gln Asp Leu His Val GlyVal 1280 1285 1290 Arg Pro Leu Arg Ala Gly Ser Arg Phe Val His Leu AsnLeu Val 1295 1300 1305 Asp Val Asp Cys His Gln Leu Val Ala Ser Trp LeuVal Cys Leu 1310 1315 1320 Cys Cys Arg Gln Pro Leu Ile Ser Lys Ala PheGlu Ile Met Leu 1325 1330 1335 Ala Ala Gly Glu Gly Lys Gly Val Asn LysArg Ile Thr Tyr Thr 1340 1345 1350 Asn Pro Tyr Pro Ser Arg Arg Thr PheHis Leu His Ser Asp His 1355 1360 1365 Pro Glu Leu Leu Arg Phe Arg GluAsp Ser Phe Gln Val Gly Gly 1370 1375 1380 Gly Glu Thr Tyr Thr Ile GlyLeu Gln Phe Ala Pro Ser Gln Arg 1385 1390 1395 Val Gly Glu Glu Glu IleLeu Ile Tyr Ile Asn Asp His Glu Asp 1400 1405 1410 Lys Asn Glu Glu AlaPhe Cys Val Lys Val Ile Tyr Gln 1415 1420 1425 19 2636 DNA Homo sapiens19 gacgcgaggc gggttcttgg actgagtgtg cggcgcggtg cgccgccttc cgaggctcct 60cccgcgggtg gcagcggacg gggcgcgccc ctcggccagt cctcggtcct caggcttgtg 120gctccgttga gcaccggccg ccgggcctct gggtccgtcg agtggagact ctctgaaaag 180cgtgggctcc gtggcctccg gcgcggccgc ggcgggtcgg tctcctagat catccgggaa 240gcccacggga ccctcaggcg ggcaggatga acgactggca caggatcttc acccaaaacg 300tgcttgtccc tccccaccca cagagagcgc gccagccttg gaaggaatcc acggcattcc 360agtgtgtcct caagtggctg gacggaccgg taattaggca gggcgtgctg gaggtactgt 420cagaggttga atgccatctg cgagtgtctt tctttgatgt cacctaccgg cacttctttg 480ggaggacgtg gaaaaccaca gtgaagccga cgaagagacc gccgtccagg atcgtcttta 540atgagccctt gtattttcac acatccctaa accaccctca tatcgtggct gtggtggaag 600tggtcgctga gggcaagaaa cgggatggga gcctccagac attgtcctgt gggtttggaa 660ttcttcggat cttcagcaac cagccggact ctcctatctc tgcttcccag gacaaaaggt 720tgcggctgta ccatggcacc cccagagccc tcctgcaccc gcttctccag gaccccgcag 780agcaaaacag acacatgacc ctcattgaga actgcagcct gcagtacacg ctgaagccac 840acccggccct ggagcctgcg ttccaccttc ttcctgagaa ccttctggtg tctggtctgc 900agcagatacc tggcctgctt ccagctcatg gagaatccgg cgacgctctc cgaaagcctc 960gcctccagaa gcccatcacg gggcacttgg atgacttatt 
cttcaccctg tacccctccc 1020tggagaagtt tgaggaagag ctgctggagc tccacgtcca ggaccacttc caggagggat 1080gtggcccact ggacggtggt gccctggaga tcctggagcg gcgcctgcgt gtgggcgtgc 1140acaatggtct gggcttcgtg cagaggccgc aggtcgttgt actggtgcct gagatggatg 1200tggccttgac gcgctcagct agcttcagca ggaaagtggt ctcctcttcc aagaccagct 1260ccgggagcca agctctggtt ttgagaagcc gcctccgcct cccagagatg gtcggccacc 1320ctgcatttgc ggtcatcttc cagctggagt acgtgttcag cagccctgca ggagtggacg 1380gcaatgcagc ttcggtcacc tctctgtcca acctggcatg catgcacatg gtccgctggg 1440ctgtttggaa ccccttgctg gaagctgatt ctggaagggt gaccctgcct ctgcagggtg 1500ggatccagcc caacccctcg cactgtctgg tctacaaggt accctcagcc agcatgagct 1560ctgaagaggt gaagcaggtg gagtcgggta cactccggtt ccagttctcg ctgggctcag 1620aagaacacct ggatgcaccc acggagcctg tcagtggccc caaagtggag cggcggcctt 1680ccaggaaacc acccacgtcc ccttcgagcc cgccagcgcc agtacctcga gttctcgctg 1740ccccgcagaa ctcacctgtg ggaccagggt tgtcaatttc ccagctggcg gcctccccgc 1800ggtccccgac tcagcactgc ttggccaggc ctacttcaca gctaccccat ggctctcagg 1860cctccccggc ccaggcacag gagttcccgt tggaggccgg tatctcccac ctggaagccg 1920acctgagcca gacctccctg gtcctggaaa catccattgc cgaacagtta caggagctgc 1980cgttcacgcc tttgcatgcc cctattgttg tgggaaccca gaccaggagc tctgcagggc 2040agccctcgag agcctccatg gtgctcctgc agtcctccgg ctttcccgag attctggatg 2100ccaataaaca gccagccgag gctgtcagcg ctacagaacc tgtgacgttt aaccctcaga 2160aggaagaatc agattgtcta caaagcaacg agatggtgct acagtttctt gcctttagca 2220gagtggccca ggactgccga ggaacatcat ggccaaagac tgtgtatttc accttccagt 2280tctaccgctt cccacccgca acgacgccac gactgcagct ggtccagctg gatgaggccg 2340gccagcccag ctctggcgcc ctgacccaca tcctcgtgcc tgtgagcaga gatggcacct 2400ttgatgctgg gtctcctggc ttccagctga ggtacatggt gggccctggg ttcctgaagc 2460caggtgagcg gcgctgcttt gcccgctacc tggccgtgca gaccctgcag attgacgtct 2520gggacggaga ctccctgctg ctcatcggat ctgctgccgt ccagatgaag catctcctcc 2580gccaaggccg gccggctgtg caggcctccc acgagcttga ggtcgtggca acttaa 2636 20790 PRT Homo sapiens misc_feature (790)..(790) Xaa can be any naturallyoccurring 
amino acid 20 Met Asn Asp Trp His Arg Ile Phe Thr Gln Asn ValLeu Val Pro Pro 1 5 10 15 His Pro Gln Arg Ala Arg Gln Pro Trp Lys GluSer Thr Ala Phe Gln 20 25 30 Cys Val Leu Lys Trp Leu Asp Gly Pro Val IleArg Gln Gly Val Leu 35 40 45 Glu Val Leu Ser Glu Val Glu Cys His Leu ArgVal Ser Phe Phe Asp 50 55 60 Val Thr Tyr Arg His Phe Phe Gly Arg Thr TrpLys Thr Thr Val Lys 65 70 75 80 Pro Thr Lys Arg Pro Pro Ser Arg Ile ValPhe Asn Glu Pro Leu Tyr 85 90 95 Phe His Thr Ser Leu Asn His Pro His IleVal Ala Val Val Glu Val 100 105 110 Val Ala Glu Gly Lys Lys Arg Asp GlySer Leu Gln Thr Leu Ser Cys 115 120 125 Gly Phe Gly Ile Leu Arg Ile PheSer Asn Gln Pro Asp Ser Pro Ile 130 135 140 Ser Ala Ser Gln Asp Lys ArgLeu Arg Leu Tyr His Gly Thr Pro Arg 145 150 155 160 Ala Leu Leu His ProLeu Leu Gln Asp Pro Ala Glu Gln Asn Arg His 165 170 175 Met Thr Leu IleGlu Asn Cys Ser Leu Gln Tyr Thr Leu Lys Pro His 180 185 190 Pro Ala LeuGlu Pro Ala Phe His Leu Leu Pro Glu Asn Leu Leu Val 195 200 205 Ser GlyLeu Gln Gln Ile Pro Gly Leu Leu Pro Ala His Gly Glu Ser 210 215 220 GlyAsp Ala Leu Arg Lys Pro Arg Leu Gln Lys Pro Ile Thr Gly His 225 230 235240 Leu Asp Asp Leu Phe Phe Thr Leu Tyr Pro Ser Leu Glu Lys Phe Glu 245250 255 Glu Glu Leu Leu Glu Leu His Val Gln Asp His Phe Gln Glu Gly Cys260 265 270 Gly Pro Leu Asp Gly Gly Ala Leu Glu Ile Leu Glu Arg Arg LeuArg 275 280 285 Val Gly Val His Asn Gly Leu Gly Phe Val Gln Arg Pro GlnVal Val 290 295 300 Val Leu Val Pro Glu Met Asp Val Ala Leu Thr Arg SerAla Ser Phe 305 310 315 320 Ser Arg Lys Val Val Ser Ser Ser Lys Thr SerSer Gly Ser Gln Ala 325 330 335 Leu Val Leu Arg Ser Arg Leu Arg Leu ProGlu Met Val Gly His Pro 340 345 350 Ala Phe Ala Val Ile Phe Gln Leu GluTyr Val Phe Ser Ser Pro Ala 355 360 365 Gly Val Asp Gly Asn Ala Ala SerVal Thr Ser Leu Ser Asn Leu Ala 370 375 380 Cys Met His Met Val Arg TrpAla Val Trp Asn Pro Leu Leu Glu Ala 385 390 395 400 Asp Ser Gly Arg ValThr Leu Pro Leu Gln Gly Gly Ile Gln Pro Asn 405 410 415 Pro Ser His CysLeu Val Tyr Lys Val 
Pro Ser Ala Ser Met Ser Ser 420 425 430 Glu Glu ValLys Gln Val Glu Ser Gly Thr Leu Arg Phe Gln Phe Ser 435 440 445 Leu GlySer Glu Glu His Leu Asp Ala Pro Thr Glu Pro Val Ser Gly 450 455 460 ProLys Val Glu Arg Arg Pro Ser Arg Lys Pro Pro Thr Ser Pro Ser 465 470 475480 Ser Pro Pro Ala Pro Val Pro Arg Val Leu Ala Ala Pro Gln Asn Ser 485490 495 Pro Val Gly Pro Gly Leu Ser Ile Ser Gln Leu Ala Ala Ser Pro Arg500 505 510 Ser Pro Thr Gln His Cys Leu Ala Arg Pro Thr Ser Gln Leu ProHis 515 520 525 Gly Ser Gln Ala Ser Pro Ala Gln Ala Gln Glu Phe Pro LeuGlu Ala 530 535 540 Gly Ile Ser His Leu Glu Ala Asp Leu Ser Gln Thr SerLeu Val Leu 545 550 555 560 Glu Thr Ser Ile Ala Glu Gln Leu Gln Glu LeuPro Phe Thr Pro Leu 565 570 575 His Ala Pro Ile Val Val Gly Thr Gln ThrArg Ser Ser Ala Gly Gln 580 585 590 Pro Ser Arg Ala Ser Met Val Leu LeuGln Ser Ser Gly Phe Pro Glu 595 600 605 Ile Leu Asp Ala Asn Lys Gln ProAla Glu Ala Val Ser Ala Thr Glu 610 615 620 Pro Val Thr Phe Asn Pro GlnLys Glu Glu Ser Asp Cys Leu Gln Ser 625 630 635 640 Asn Glu Met Val LeuGln Phe Leu Ala Phe Ser Arg Val Ala Gln Asp 645 650 655 Cys Arg Gly ThrSer Trp Pro Lys Thr Val Tyr Phe Thr Phe Gln Phe 660 665 670 Tyr Arg PhePro Pro Ala Thr Thr Pro Arg Leu Gln Leu Val Gln Leu 675 680 685 Asp GluAla Gly Gln Pro Ser Ser Gly Ala Leu Thr His Ile Leu Val 690 695 700 ProVal Ser Arg Asp Gly Thr Phe Asp Ala Gly Ser Pro Gly Phe Gln 705 710 715720 Leu Arg Tyr Met Val Gly Pro Gly Phe Leu Lys Pro Gly Glu Arg Arg 725730 735 Cys Phe Ala Arg Tyr Leu Ala Val Gln Thr Leu Gln Ile Asp Val Trp740 745 750 Asp Gly Asp Ser Leu Leu Leu Ile Gly Ser Ala Ala Val Gln MetLys 755 760 765 His Leu Leu Arg Gln Gly Arg Pro Ala Val Gln Ala Ser HisGlu Leu 770 775 780 Glu Val Val Ala Thr Xaa 785 790 21 3558 DNA Homosapiens 21 ggttgctccc ggttgctaag aagactatga acaagtcaga gaacctgctgtttgctggtt 60 catcattagc atcacaagtc catgctgctg ccgttaatgg agataagggtgctctacaga 120 ggctcatcgt aggaaactct gctcttaaag acaaagaaga tcagtttgggagaacaccac 180 ttatgtattg cgtgttggct 
gacagattgg attgtgcaga tgctcttctgaaggcaggag 240 cagatgtgaa taaaactgac catagccaga gaacagccct ccatcttgcagcccagaagg 300 gaaattatcg tttcatgaaa ctcttactta cacgcagagc aaactggatgcaaaaggatc 360 tggaagagat gactcctttg cacttgacca cccggcacag gagccctaagtgtttggcac 420 ttctgctgaa gtttatggca ccaggagaag tggatacaca ggataaaaacaagcaaacag 480 ctctgcattg gagtgcctac tacaataacc ctgagcatgt gaagctgctcatcaagcatg 540 attctaacat tgggattcct gatgttgaag gcaagatccc acttcactgggcagccaacc 600 ataaagatcc aagtgctgtt cacacagtga gatgcattct ggatgctgctccaacagagt 660 ctttactgaa ctggcaagac tacgagggtc gaactcctct tcactttgcagttgctgatg 720 ggaatgtgac cgtggttgat gtcttgacct catatgaaag ctgcaatataacgtcttatg 780 ataacttatt tcgaacccca ctgcactggg cagctttatt aggccatgcacagattgtcc 840 atctcctttt agaaagaaat aagtctggaa ctatcccatc tgacagccaaggagccacac 900 ctttgcacta tgctgctcag agtaactttg ctgaaacggt taaagtgtttttaaaacatc 960 cttcagtgaa agatgattca gacctggaag gaagaacatc ctttatgtgggcagctggca 1020 aaggcagtga tgatgtcctt agaactatgc tgagcttaaa atcggacatagatattaaca 1080 tggctgacaa atatggaggt acagctttgc atgctgctgc tctttctggccatgtcagca 1140 ccgtgaagtt attactggaa aataatgctc aagtagatgc tactgatgttatgaaacata 1200 ctccactttt ccgagcctgt gagatgggac acaaagatgt gattcagacactcattaaag 1260 gtggagcaag ggtagatcta gttgaccaag atggacattc tcttctacattgggcagcac 1320 tgggaggaaa tgctgatgtt tgccagatat taatagaaaa taagatcaatccaaatgtcc 1380 aggattatgc aggaagaacc cctttgcagt gtgcagcata tggaggctatatcaactgca 1440 tggcagttct catggaaaac aatgcagacc ctaacattca agacaaagagggaagaacag 1500 ctttgcattg gtcctgcaac aatggatacc ttgatgccat taaattactgctagactttg 1560 ctgctttccc taatcagatg gaaaacaatg aagagagata cacaccccttgattatgctt 1620 tgcttggtga gcgccatgaa gtgatccagt tcatgttgga gcacggtgccctgtccatcg 1680 cagccataca agacatcgcc gccttcaaaa tccaagctgt ctacaaagggtacaaggtca 1740 gaaaagcctt ccgagacagg aaaaatctcc tcatgaagca tgaacagttgagaaaagatg 1800 ctgctgccaa aaagcgagag gaagaaaaca aacgaaaaga ggcagaacagcaaaaaggaa 1860 ggcggagccc agattcctgc agaccccagg cccttccctg tctgcctagcacccaggatg 1920 
tgcccagcag gcagagccgg gcccccagca agcagcctcc tgctggcaacgtggcccaag 1980 gccctgagcc aagagacagc agaggatctc caggagggtc tctaggcggagccctccaga 2040 aggagcagca tgtttcctca gatttgcagg gaacaaactc cagaaggccaaatgaaacag 2100 ccagagaaca ttctaaaggc caatctgctt gtgtccactt cagacccaatgaaggcagtg 2160 atggaagcag gcatccagga gttccctctg ttgagaagtc cagaggtgagacagctggcg 2220 atgagcggtg tgcaaagggg aaaggtttcg tgaagcagcc ctcctgtatcagggtggctg 2280 ggcctgatga gaaaggagag gactccaggc gggcaggtgc aagccttccaccgcacgata 2340 gccactggaa gcccagcagg cggcatgaca cagaacccaa ggccaaatgtgccccccaga 2400 aaaggcgcac tcaagagctc agaggaggaa ggtgctctcc ggctggttctagccgccctg 2460 gcagtgcccg gggggaggcg gtccatgctg ggcagaatcc tccccaccatcgtacaccaa 2520 gaaacaaagt gacacaagcc aagctcacag gagggctcta ttcacatttgccacagagca 2580 cagaggagtt gaggtcagga gctaggaggc tggagacatc taccctgtccgaggactttc 2640 aggtatctaa ggagactgat ccagcacctg gtcccctctc tgggcagagtgtgaatattg 2700 accttctccc cgtagagctc cgactgcaga taattcagag agaacgaaggaggaaggagc 2760 tgtttcgcaa aaagaacaag gcagcagcag tcatccagcg cgcctggcgaagctaccagc 2820 tcaggaagca cctgtcccac cttcggcata tgaagcagct tggagctggagatgtggaca 2880 gatggaggca agagtctaca gcattgctcc tccaggtttg gaggaaggaactggaactaa 2940 aattccccca aaccactgca gtaagcaagg cccccaagag tccatccaagggcacctcag 3000 gcacaaagtc caccaagcac tcagtgctta agcaaatcta tggttgttctcacgaaggga 3060 aaatacatca tcctacaaga tctgtaaaag cctcttctgt gctgcgtctcaactcagtga 3120 gcaacctaca gtgtatacat ctccttgaga acagtggaag atcaaagaacttttcttata 3180 acctgcaatc agctactcag ccaaaaaaca aaacaaaacc ttgactgcctatggaggaag 3240 actgtgttcg ggggagctgg catagctagt gcagagttca gattttctgctgataatctt 3300 ttacaccttg ggaaaacttt aatatccgta cctgaaggct gattcacctaaaaatgtgtt 3360 aactgaaaga aaatgtcaga atgtttcctt tctgctctta cacagcattgttttgtcaat 3420 caacacagcc tgcactgaaa ggacctgcat agactatgtc tgtgcaaagtgcctgagtgt 3480 ctgctttcac ctcagtctgt acagttggaa atgagaattc ataattaacagcaaaatcta 3540 aggaaaacta aaataaaa 3558 22 1065 PRT Homo sapiens 22 MetAsn Lys Ser Glu Asn Leu Leu Phe Ala Gly Ser Ser Leu 
Ala Ser 1 5 10 15Gln Val His Ala Ala Ala Val Asn Gly Asp Lys Gly Ala Leu Gln Arg 20 25 30Leu Ile Val Gly Asn Ser Ala Leu Lys Asp Lys Glu Asp Gln Phe Gly 35 40 45Arg Thr Pro Leu Met Tyr Cys Val Leu Ala Asp Arg Leu Asp Cys Ala 50 55 60Asp Ala Leu Leu Lys Ala Gly Ala Asp Val Asn Lys Thr Asp His Ser 65 70 7580 Gln Arg Thr Ala Leu His Leu Ala Ala Gln Lys Gly Asn Tyr Arg Phe 85 9095 Met Lys Leu Leu Leu Thr Arg Arg Ala Asn Trp Met Gln Lys Asp Leu 100105 110 Glu Glu Met Thr Pro Leu His Leu Thr Thr Arg His Arg Ser Pro Lys115 120 125 Cys Leu Ala Leu Leu Leu Lys Phe Met Ala Pro Gly Glu Val AspThr 130 135 140 Gln Asp Lys Asn Lys Gln Thr Ala Leu His Trp Ser Ala TyrTyr Asn 145 150 155 160 Asn Pro Glu His Val Lys Leu Leu Ile Lys His AspSer Asn Ile Gly 165 170 175 Ile Pro Asp Val Glu Gly Lys Ile Pro Leu HisTrp Ala Ala Asn His 180 185 190 Lys Asp Pro Ser Ala Val His Thr Val ArgCys Ile Leu Asp Ala Ala 195 200 205 Pro Thr Glu Ser Leu Leu Asn Trp GlnAsp Tyr Glu Gly Arg Thr Pro 210 215 220 Leu His Phe Ala Val Ala Asp GlyAsn Val Thr Val Val Asp Val Leu 225 230 235 240 Thr Ser Tyr Glu Ser CysAsn Ile Thr Ser Tyr Asp Asn Leu Phe Arg 245 250 255 Thr Pro Leu His TrpAla Ala Leu Leu Gly His Ala Gln Ile Val His 260 265 270 Leu Leu Leu GluArg Asn Lys Ser Gly Thr Ile Pro Ser Asp Ser Gln 275 280 285 Gly Ala ThrPro Leu His Tyr Ala Ala Gln Ser Asn Phe Ala Glu Thr 290 295 300 Val LysVal Phe Leu Lys His Pro Ser Val Lys Asp Asp Ser Asp Leu 305 310 315 320Glu Gly Arg Thr Ser Phe Met Trp Ala Ala Gly Lys Gly Ser Asp Asp 325 330335 Val Leu Arg Thr Met Leu Ser Leu Lys Ser Asp Ile Asp Ile Asn Met 340345 350 Ala Asp Lys Tyr Gly Gly Thr Ala Leu His Ala Ala Ala Leu Ser Gly355 360 365 His Val Ser Thr Val Lys Leu Leu Leu Glu Asn Asn Ala Gln ValAsp 370 375 380 Ala Thr Asp Val Met Lys His Thr Pro Leu Phe Arg Ala CysGlu Met 385 390 395 400 Gly His Lys Asp Val Ile Gln Thr Leu Ile Lys GlyGly Ala Arg Val 405 410 415 Asp Leu Val Asp Gln Asp Gly His Ser Leu LeuHis Trp Ala Ala Leu 420 425 430 Gly Gly Asn Ala Asp Val Cys 
Gln Ile LeuIle Glu Asn Lys Ile Asn 435 440 445 Pro Asn Val Gln Asp Tyr Ala Gly ArgThr Pro Leu Gln Cys Ala Ala 450 455 460 Tyr Gly Gly Tyr Ile Asn Cys MetAla Val Leu Met Glu Asn Asn Ala 465 470 475 480 Asp Pro Asn Ile Gln AspLys Glu Gly Arg Thr Ala Leu His Trp Ser 485 490 495 Cys Asn Asn Gly TyrLeu Asp Ala Ile Lys Leu Leu Leu Asp Phe Ala 500 505 510 Ala Phe Pro AsnGln Met Glu Asn Asn Glu Glu Arg Tyr Thr Pro Leu 515 520 525 Asp Tyr AlaLeu Leu Gly Glu Arg His Glu Val Ile Gln Phe Met Leu 530 535 540 Glu HisGly Ala Leu Ser Ile Ala Ala Ile Gln Asp Ile Ala Ala Phe 545 550 555 560Lys Ile Gln Ala Val Tyr Lys Gly Tyr Lys Val Arg Lys Ala Phe Arg 565 570575 Asp Arg Lys Asn Leu Leu Met Lys His Glu Gln Leu Arg Lys Asp Ala 580585 590 Ala Ala Lys Lys Arg Glu Glu Glu Asn Lys Arg Lys Glu Ala Glu Gln595 600 605 Gln Lys Gly Arg Arg Ser Pro Asp Ser Cys Arg Pro Gln Ala LeuPro 610 615 620 Cys Leu Pro Ser Thr Gln Asp Val Pro Ser Arg Gln Ser ArgAla Pro 625 630 635 640 Ser Lys Gln Pro Pro Ala Gly Asn Val Ala Gln GlyPro Glu Pro Arg 645 650 655 Asp Ser Arg Gly Ser Pro Gly Gly Ser Leu GlyGly Ala Leu Gln Lys 660 665 670 Glu Gln His Val Ser Ser Asp Leu Gln GlyThr Asn Ser Arg Arg Pro 675 680 685 Asn Glu Thr Ala Arg Glu His Ser LysGly Gln Ser Ala Cys Val His 690 695 700 Phe Arg Pro Asn Glu Gly Ser AspGly Ser Arg His Pro Gly Val Pro 705 710 715 720 Ser Val Glu Lys Ser ArgGly Glu Thr Ala Gly Asp Glu Arg Cys Ala 725 730 735 Lys Gly Lys Gly PheVal Lys Gln Pro Ser Cys Ile Arg Val Ala Gly 740 745 750 Pro Asp Glu LysGly Glu Asp Ser Arg Arg Ala Gly Ala Ser Leu Pro 755 760 765 Pro His AspSer His Trp Lys Pro Ser Arg Arg His Asp Thr Glu Pro 770 775 780 Lys AlaLys Cys Ala Pro Gln Lys Arg Arg Thr Gln Glu Leu Arg Gly 785 790 795 800Gly Arg Cys Ser Pro Ala Gly Ser Ser Arg Pro Gly Ser Ala Arg Gly 805 810815 Glu Ala Val His Ala Gly Gln Asn Pro Pro His His Arg Thr Pro Arg 820825 830 Asn Lys Val Thr Gln Ala Lys Leu Thr Gly Gly Leu Tyr Ser His Leu835 840 845 Pro Gln Ser Thr Glu Glu Leu Arg Ser Gly Ala Arg Arg Leu 
GluThr 850 855 860 Ser Thr Leu Ser Glu Asp Phe Gln Val Ser Lys Glu Thr AspPro Ala 865 870 875 880 Pro Gly Pro Leu Ser Gly Gln Ser Val Asn Ile AspLeu Leu Pro Val 885 890 895 Glu Leu Arg Leu Gln Ile Ile Gln Arg Glu ArgArg Arg Lys Glu Leu 900 905 910 Phe Arg Lys Lys Asn Lys Ala Ala Ala ValIle Gln Arg Ala Trp Arg 915 920 925 Ser Tyr Gln Leu Arg Lys His Leu SerHis Leu Arg His Met Lys Gln 930 935 940 Leu Gly Ala Gly Asp Val Asp ArgTrp Arg Gln Glu Ser Thr Ala Leu 945 950 955 960 Leu Leu Gln Val Trp ArgLys Glu Leu Glu Leu Lys Phe Pro Gln Thr 965 970 975 Thr Ala Val Ser LysAla Pro Lys Ser Pro Ser Lys Gly Thr Ser Gly 980 985 990 Thr Lys Ser ThrLys His Ser Val Leu Lys Gln Ile Tyr Gly Cys Ser 995 1000 1005 His GluGly Lys Ile His His Pro Thr Arg Ser Val Lys Ala Ser 1010 1015 1020 SerVal Leu Arg Leu Asn Ser Val Ser Asn Leu Gln Cys Ile His 1025 1030 1035Leu Leu Glu Asn Ser Gly Arg Ser Lys Asn Phe Ser Tyr Asn Leu 1040 10451050 Gln Ser Ala Thr Gln Pro Lys Asn Lys Thr Lys Pro 1055 1060 1065 233558 DNA Homo sapiens 23 ggttgctccc ggttgctaag aagactatga acaagtcagagaacctgctg tttgctggtt 60 catcattagc atcacaagtc catgctgctg ccgttaatggagataagggt gctctacaga 120 ggctcatcgt aggaaactct gctcttaaag acaaagaagatcagtttggg agaacaccac 180 ttatgtattg cgtgttggct gacagattgg attgtgcagatgctcttctg aaggcaggag 240 cagatgtgaa taaaactgac catagccaga gaacagccctccatcttgca gcccagaagg 300 gaaattatcg tttcatgaaa ctcttactta cacgcagagcaaactggatg caaaaggatc 360 tggaagagat gactcctttg cacttgacca cccggcacaggagccctaag tgtttggcac 420 ttctgctgaa gtttatggca ccaggagaag tggatacacaggataaaaac aagcaaacag 480 ctctgcattg gagtgcctac tacaataacc ctgagcatgtgaagctgctc atcaagcatg 540 attctaacat tgggattcct gatgttgaag gcaagatcccacttcactgg gcagccaacc 600 ataaagatcc aagtgctgtt cacacagtga gatgcattctggatgctgct ccaacagagt 660 ctttactgaa ctggcaagac tacgagggtc gaactcctcttcactttgca gttgctgatg 720 ggaatgtgac cgtggttgat gtcttgacct catatgaaagctgcaatata acgtcttatg 780 ataacttatt tcgaacccca ctgcactggg cagctttattaggccatgca cagattgtcc 840 atctcctttt 
agaaagaaat aagtctggaa ctatcccatctgacagccaa ggagccacac 900 ctttgcacta tgctgctcag agtaactttg ctgaaacggttaaagtgttt ttaaaacatc 960 cttcagtgaa agatgattca gacctggaag gaagaacatcctttatgtgg gcagctggca 1020 aaggcagtga tgatgtcctt agaactatgc tgagcttaaaatcggacata gatattaaca 1080 tggctgacaa atatggaggt acagctttgc atgctgctgctctttctggc catgtcagca 1140 ccgtgaagtt attactggaa aataatgctc aagtagatgctactgatgtt atgaaacata 1200 ctccactttt ccgagcctgt gagatgggac acaaagatgtgattcagaca ctcattaaag 1260 gtggagcaag ggtagatcta gttgaccaag atggacattctcttctacat tgggcagcac 1320 tgggaggaaa tgctgatgtt tgccagatat taatagaaaataagatcaat ccaaatgtcc 1380 aggattatgc aggaagaacc cctttgcagt gtgcagcatatggaggctat atcaactgca 1440 tggcagttct catggaaaac aatgcagacc ctaacattcaagacaaagag ggaagaacag 1500 ctttgcattg gtcctgcaac aatggatacc ttgatgccattaaattactg ctagactttg 1560 ctgctttccc taatcagatg gaaaacaatg aagagagatacacacccctt gattatgctt 1620 tgcttggtga gcgccatgaa gtgatccagt tcatgttggagcacggtgcc ctgtccatcg 1680 cagccataca agacatcgcc gccttcaaaa tccaagctgtctacaaaggg tacaaggtca 1740 gaaaagcctt ccgagacagg aaaaatctcc tcatgaagcatgaacagttg agaaaagatg 1800 ctgctgccaa aaagcgagag gaagaaaaca aacgaaaagaggcagaacag caaaaaggaa 1860 ggcggagccc agattcctgc agaccccagg cccttccctgtctgcctagc acccaggatg 1920 tgcccagcag gcagagccgg gcccccagca agcagcctcctgctggcaac gtggcccaag 1980 gccctgagcc aagagacagc agaggatctc caggagggtctctaggcgga gccctccaga 2040 aggagcagca tgtttcctca gatttgcagg gaacaaactccagaaggcca aatgaaacag 2100 ccagagaaca ttctaaaggc caatctgctt gtgtccacttcagacccaat gaaggcagtg 2160 atggaagcag gcatccagga gttccctctg ttgagaagtccagaggtgag acagctggcg 2220 atgagcggtg tgcaaagggg aaaggtttcg tgaagcagccctcctgtatc agggtggctg 2280 ggcctgatga gaaaggagag gactccaggc gggcaggtgcaagccttcca ccgcacgata 2340 gccactggaa gcccagcagg cggcatgaca cagaacccaaggccaaatgt gccccccaga 2400 aaaggcgcac tcaagagctc agaggaggaa ggtgctctccggctggttct agccgccctg 2460 gcagtgcccg gggggaggcg gtccatgctg ggcagaatcctccccaccat cgtacaccaa 2520 gaaacaaagt gacacaagcc aagctcacag 
gagggctctattcacatttg ccacagagca 2580 cagaggagtt gaggtcagga gctaggaggc tggagacatctaccctgtcc gaggactttc 2640 aggtatctaa ggagactgat ccagcacctg gtcccctctctgggcagagt gtgaatattg 2700 accttctccc cgtagagctc tgactgcaga taattcagagagaacgaagg aggaaggagc 2760 tgtttcgcaa aaagaacaag gcagcagcag tcatccagcgcgcctggcga agctaccagc 2820 tcaggaagca cctgtcccac cttcggcata tgaagcagcttggagctgga gatgtggaca 2880 gatggaggca agagtctaca gcattgctcc tccaggtttggaggaaggaa ctggaactaa 2940 aattccccca aaccactgca gtaagcaagg cccccaagagtccatccaag ggcacctcag 3000 gcacaaagtc caccaagcac tcagtgctta agcaaatctatggttgttct cacgaaggga 3060 aaatacatca tcctacaaga tctgtaaaag cctcttctgtgctgcgtctc aactcagtga 3120 gcaacctaca gtgtatacat ctccttgaga acagtggaagatcaaagaac ttttcttata 3180 acctgcaatc agctactcag ccaaaaaaca aaacaaaaccttgactgcct atggaggaag 3240 actgtgttcg ggggagctgg catagctagt gcagagttcagattttctgc tgataatctt 3300 ttacaccttg ggaaaacttt aatatccgta cctgaaggctgattcaccta aaaatgtgtt 3360 aactgaaaga aaatgtcaga atgtttcctt tctgctcttacacagcattg ttttgtcaat 3420 caacacagcc tgcactgaaa ggacctgcat agactatgtctgtgcaaagt gcctgagtgt 3480 ctgctttcac ctcagtctgt acagttggaa atgagaattcataattaaca gcaaaatcta 3540 aggaaaacta aaataaaa 3558 24 898 PRT Homosapiens 24 Met Asn Lys Ser Glu Asn Leu Leu Phe Ala Gly Ser Ser Leu AlaSer 1 5 10 15 Gln Val His Ala Ala Ala Val Asn Gly Asp Lys Gly Ala LeuGln Arg 20 25 30 Leu Ile Val Gly Asn Ser Ala Leu Lys Asp Lys Glu Asp GlnPhe Gly 35 40 45 Arg Thr Pro Leu Met Tyr Cys Val Leu Ala Asp Arg Leu AspCys Ala 50 55 60 Asp Ala Leu Leu Lys Ala Gly Ala Asp Val Asn Lys Thr AspHis Ser 65 70 75 80 Gln Arg Thr Ala Leu His Leu Ala Ala Gln Lys Gly AsnTyr Arg Phe 85 90 95 Met Lys Leu Leu Leu Thr Arg Arg Ala Asn Trp Met GlnLys Asp Leu 100 105 110 Glu Glu Met Thr Pro Leu His Leu Thr Thr Arg HisArg Ser Pro Lys 115 120 125 Cys Leu Ala Leu Leu Leu Lys Phe Met Ala ProGly Glu Val Asp Thr 130 135 140 Gln Asp Lys Asn Lys Gln Thr Ala Leu HisTrp Ser Ala Tyr Tyr Asn 145 150 155 160 Asn Pro Glu His Val Lys Leu LeuIle Lys His Asp Ser 
Asn Ile Gly 165 170 175 Ile Pro Asp Val Glu Gly LysIle Pro Leu His Trp Ala Ala Asn His 180 185 190 Lys Asp Pro Ser Ala ValHis Thr Val Arg Cys Ile Leu Asp Ala Ala 195 200 205 Pro Thr Glu Ser LeuLeu Asn Trp Gln Asp Tyr Glu Gly Arg Thr Pro 210 215 220 Leu His Phe AlaVal Ala Asp Gly Asn Val Thr Val Val Asp Val Leu 225 230 235 240 Thr SerTyr Glu Ser Cys Asn Ile Thr Ser Tyr Asp Asn Leu Phe Arg 245 250 255 ThrPro Leu His Trp Ala Ala Leu Leu Gly His Ala Gln Ile Val His 260 265 270Leu Leu Leu Glu Arg Asn Lys Ser Gly Thr Ile Pro Ser Asp Ser Gln 275 280285 Gly Ala Thr Pro Leu His Tyr Ala Ala Gln Ser Asn Phe Ala Glu Thr 290295 300 Val Lys Val Phe Leu Lys His Pro Ser Val Lys Asp Asp Ser Asp Leu305 310 315 320 Glu Gly Arg Thr Ser Phe Met Trp Ala Ala Gly Lys Gly SerAsp Asp 325 330 335 Val Leu Arg Thr Met Leu Ser Leu Lys Ser Asp Ile AspIle Asn Met 340 345 350 Ala Asp Lys Tyr Gly Gly Thr Ala Leu His Ala AlaAla Leu Ser Gly 355 360 365 His Val Ser Thr Val Lys Leu Leu Leu Glu AsnAsn Ala Gln Val Asp 370 375 380 Ala Thr Asp Val Met Lys His Thr Pro LeuPhe Arg Ala Cys Glu Met 385 390 395 400 Gly His Lys Asp Val Ile Gln ThrLeu Ile Lys Gly Gly Ala Arg Val 405 410 415 Asp Leu Val Asp Gln Asp GlyHis Ser Leu Leu His Trp Ala Ala Leu 420 425 430 Gly Gly Asn Ala Asp ValCys Gln Ile Leu Ile Glu Asn Lys Ile Asn 435 440 445 Pro Asn Val Gln AspTyr Ala Gly Arg Thr Pro Leu Gln Cys Ala Ala 450 455 460 Tyr Gly Gly TyrIle Asn Cys Met Ala Val Leu Met Glu Asn Asn Ala 465 470 475 480 Asp ProAsn Ile Gln Asp Lys Glu Gly Arg Thr Ala Leu His Trp Ser 485 490 495 CysAsn Asn Gly Tyr Leu Asp Ala Ile Lys Leu Leu Leu Asp Phe Ala 500 505 510Ala Phe Pro Asn Gln Met Glu Asn Asn Glu Glu Arg Tyr Thr Pro Leu 515 520525 Asp Tyr Ala Leu Leu Gly Glu Arg His Glu Val Ile Gln Phe Met Leu 530535 540 Glu His Gly Ala Leu Ser Ile Ala Ala Ile Gln Asp Ile Ala Ala Phe545 550 555 560 Lys Ile Gln Ala Val Tyr Lys Gly Tyr Lys Val Arg Lys AlaPhe Arg 565 570 575 Asp Arg Lys Asn Leu Leu Met Lys His Glu Gln Leu ArgLys Asp Ala 580 585 590 Ala Ala 
Lys Lys Arg Glu Glu Glu Asn Lys Arg LysGlu Ala Glu Gln 595 600 605 Gln Lys Gly Arg Arg Ser Pro Asp Ser Cys ArgPro Gln Ala Leu Pro 610 615 620 Cys Leu Pro Ser Thr Gln Asp Val Pro SerArg Gln Ser Arg Ala Pro 625 630 635 640 Ser Lys Gln Pro Pro Ala Gly AsnVal Ala Gln Gly Pro Glu Pro Arg 645 650 655 Asp Ser Arg Gly Ser Pro GlyGly Ser Leu Gly Gly Ala Leu Gln Lys 660 665 670 Glu Gln His Val Ser SerAsp Leu Gln Gly Thr Asn Ser Arg Arg Pro 675 680 685 Asn Glu Thr Ala ArgGlu His Ser Lys Gly Gln Ser Ala Cys Val His 690 695 700 Phe Arg Pro AsnGlu Gly Ser Asp Gly Ser Arg His Pro Gly Val Pro 705 710 715 720 Ser ValGlu Lys Ser Arg Gly Glu Thr Ala Gly Asp Glu Arg Cys Ala 725 730 735 LysGly Lys Gly Phe Val Lys Gln Pro Ser Cys Ile Arg Val Ala Gly 740 745 750Pro Asp Glu Lys Gly Glu Asp Ser Arg Arg Ala Gly Ala Ser Leu Pro 755 760765 Pro His Asp Ser His Trp Lys Pro Ser Arg Arg His Asp Thr Glu Pro 770775 780 Lys Ala Lys Cys Ala Pro Gln Lys Arg Arg Thr Gln Glu Leu Arg Gly785 790 795 800 Gly Arg Cys Ser Pro Ala Gly Ser Ser Arg Pro Gly Ser AlaArg Gly 805 810 815 Glu Ala Val His Ala Gly Gln Asn Pro Pro His His ArgThr Pro Arg 820 825 830 Asn Lys Val Thr Gln Ala Lys Leu Thr Gly Gly LeuTyr Ser His Leu 835 840 845 Pro Gln Ser Thr Glu Glu Leu Arg Ser Gly AlaArg Arg Leu Glu Thr 850 855 860 Ser Thr Leu Ser Glu Asp Phe Gln Val SerLys Glu Thr Asp Pro Ala 865 870 875 880 Pro Gly Pro Leu Ser Gly Gln SerVal Asn Ile Asp Leu Leu Pro Val 885 890 895 Glu Leu 25 3557 DNA Homosapiens 25 ggttgctccc ggttgctaag aagactatga acaagtcaga gaacctgctgtttgctggtt 60 catcattagc atcacaagtc catgctgctg ccgttaatgg agataagggtgctctacaga 120 ggctcatcgt aggaaactct gctcttaaag acaaagaaga tcagtttgggagaacaccac 180 ttatgtattg cgtgttggct gacagattgg attgtgcaga tgctcttctgaaggcaggag 240 cagatgtgaa taaaactgac catagccaga gaacagccct ccatcttgcagcccagaagg 300 gaaattatcg tttcatgaaa ctcttactta cacgcagagc aaactggatgcaaaaggatc 360 tggaagagat gactcctttg cacttgacca cccggcacag gagccctaagtgtttggcac 420 ttctgctgaa gtttatggca ccaggagaag tggatacaca 
ggataaaaacaagcaaacag 480 ctctgcattg gagtgcctac tacaataacc ctgagcatgt gaagctgctcatcaagcatg 540 attctaacat tgggattcct gatgttgaag gcaagatccc acttcactgggcagccaacc 600 ataaagatcc aagtgctgtt cacacagtga gatgcattct ggatgctgctccaacagagt 660 ctttactgaa ctggcaagac tacgagggtc gaactcctct tcactttgcagttgctgatg 720 ggaatgtgac cgtggttgat gtcttgacct catatgaaag ctgcaatataacgtcttatg 780 ataacttatt tcgaacccca ctgcactggg cagctttatt aggccatgcacagattgtcc 840 atctcctttt agaaagaaat aagtctggaa ctatcccatc tgacagccaaggagccacac 900 ctttgcacta tgctgctcag agtaactttg ctgaaacggt taaagtgtttttaaaacatc 960 cttcagtgaa agatgattca gacctggaag gaagaacatc ctttatgtgggcagctggca 1020 aaggcagtga tgatgtcctt agaactatgc tgagcttaaa atcggacatagatattaaca 1080 tggctgacaa atatggaggt acagctttgc atgctgctgc tctttctggccatgtcagca 1140 ccgtgaagtt attactggaa aataatgctc aagtagatgc tactgatgttatgaaacata 1200 ctccactttt ccgagcctgt gagatgggac acaaagatgt gattcagacactcattaaag 1260 gtggagcaag ggtagatcta gttgaccaag atggacattc tcttctacattgggcagcac 1320 tgggaggaaa tgctgatgtt tgccagatat taatagaaaa taagatcaatccaaatgtcc 1380 aggattatgc aggaagaacc cctttgcagt gtgcagcata tggaggctatatcaactgca 1440 tggcagttct catggaaaac aatgcagacc ctaacattaa gacaaagagggaagaacagc 1500 tttgcattgg tcctgcaaca atggatacct tgatgccatt aaattactgctagactttgc 1560 tgctttccct aatcagatgg aaaacaatga agagagatac acaccccttgattatgcttt 1620 gcttggtgag cgccatgaag tgatccagtt catgttggag cacggtgccctgtccatcgc 1680 agccatacaa gacatcgccg ccttcaaaat ccaagctgtc tacaaagggtacaaggtcag 1740 aaaagccttc cgagacagga aaaatctcct catgaagcat gaacagttgagaaaagatgc 1800 tgctgccaaa aagcgagagg aagaaaacaa acgaaaagag gcagaacagcaaaaaggaag 1860 gcggagccca gattcctgca gaccccaggc ccttccctgt ctgcctagcacccaggatgt 1920 gcccagcagg cagagccggg cccccagcaa gcagcctcct gctggcaacgtggcccaagg 1980 ccctgagcca agagacagca gaggatctcc aggagggtct ctaggcggagccctccagaa 2040 ggagcagcat gtttcctcag atttgcaggg aacaaactcc agaaggccaaatgaaacagc 2100 cagagaacat tctaaaggcc aatctgcttg tgtccacttc agacccaatgaaggcagtga 2160 tggaagcagg catccaggag 
ttccctctgt tgagaagtcc agaggtgagacagctggcga 2220 tgagcggtgt gcaaagggga aaggtttcgt gaagcagccc tcctgtatcagggtggctgg 2280 gcctgatgag aaaggagagg actccaggcg ggcaggtgca agccttccaccgcacgatag 2340 ccactggaag cccagcaggc ggcatgacac agaacccaag gccaaatgtgccccccagaa 2400 aaggcgcact caagagctca gaggaggaag gtgctctccg gctggttctagccgccctgg 2460 cagtgcccgg ggggaggcgg tccatgctgg gcagaatcct ccccaccatcgtacaccaag 2520 aaacaaagtg acacaagcca agctcacagg agggctctat tcacatttgccacagagcac 2580 agaggagttg aggtcaggag ctaggaggct ggagacatct accctgtccgaggactttca 2640 ggtatctaag gagactgatc cagcacctgg tcccctctct gggcagagtgtgaatattga 2700 ccttctcccc gtagagctcc gactgcagat aattcagaga gaacgaaggaggaaggagct 2760 gtttcgcaaa aagaacaagg cagcagcagt catccagcgc gcctggcgaagctaccagct 2820 caggaagcac ctgtcccacc ttcggcatat gaagcagctt ggagctggagatgtggacag 2880 atggaggcaa gagtctacag cattgctcct ccaggtttgg aggaaggaactggaactaaa 2940 attcccccaa accactgcag taagcaaggc ccccaagagt ccatccaagggcacctcagg 3000 cacaaagtcc accaagcact cagtgcttaa gcaaatctat ggttgttctcacgaagggaa 3060 aatacatcat cctacaagat ctgtaaaagc ctcttctgtg ctgcgtctcaactcagtgag 3120 caacctacag tgtatacatc tccttgagaa cagtggaaga tcaaagaacttttcttataa 3180 cctgcaatca gctactcagc caaaaaacaa aacaaaacct tgactgcctatggaggaaga 3240 ctgtgttcgg gggagctggc atagctagtg cagagttcag attttctgctgataatcttt 3300 tacaccttgg gaaaacttta atatccgtac ctgaaggctg attcacctaaaaatgtgtta 3360 actgaaagaa aatgtcagaa tgtttccttt ctgctcttac acagcattgttttgtcaatc 3420 aacacagcct gcactgaaag gacctgcata gactatgtct gtgcaaagtgcctgagtgtc 3480 tgctttcacc tcagtctgta cagttggaaa tgagaattca taattaacagcaaaatctaa 3540 ggaaaactaa aataaaa 3557 26 510 PRT Homo sapiens 26 MetAsn Lys Ser Glu Asn Leu Leu Phe Ala Gly Ser Ser Leu Ala Ser 1 5 10 15Gln Val His Ala Ala Ala Val Asn Gly Asp Lys Gly Ala Leu Gln Arg 20 25 30Leu Ile Val Gly Asn Ser Ala Leu Lys Asp Lys Glu Asp Gln Phe Gly 35 40 45Arg Thr Pro Leu Met Tyr Cys Val Leu Ala Asp Arg Leu Asp Cys Ala 50 55 60Asp Ala Leu Leu Lys Ala Gly Ala Asp Val Asn Lys Thr Asp His Ser 65 70 
7580 Gln Arg Thr Ala Leu His Leu Ala Ala Gln Lys Gly Asn Tyr Arg Phe 85 9095 Met Lys Leu Leu Leu Thr Arg Arg Ala Asn Trp Met Gln Lys Asp Leu 100105 110 Glu Glu Met Thr Pro Leu His Leu Thr Thr Arg His Arg Ser Pro Lys115 120 125 Cys Leu Ala Leu Leu Leu Lys Phe Met Ala Pro Gly Glu Val AspThr 130 135 140 Gln Asp Lys Asn Lys Gln Thr Ala Leu His Trp Ser Ala TyrTyr Asn 145 150 155 160 Asn Pro Glu His Val Lys Leu Leu Ile Lys His AspSer Asn Ile Gly 165 170 175 Ile Pro Asp Val Glu Gly Lys Ile Pro Leu HisTrp Ala Ala Asn His 180 185 190 Lys Asp Pro Ser Ala Val His Thr Val ArgCys Ile Leu Asp Ala Ala 195 200 205 Pro Thr Glu Ser Leu Leu Asn Trp GlnAsp Tyr Glu Gly Arg Thr Pro 210 215 220 Leu His Phe Ala Val Ala Asp GlyAsn Val Thr Val Val Asp Val Leu 225 230 235 240 Thr Ser Tyr Glu Ser CysAsn Ile Thr Ser Tyr Asp Asn Leu Phe Arg 245 250 255 Thr Pro Leu His TrpAla Ala Leu Leu Gly His Ala Gln Ile Val His 260 265 270 Leu Leu Leu GluArg Asn Lys Ser Gly Thr Ile Pro Ser Asp Ser Gln 275 280 285 Gly Ala ThrPro Leu His Tyr Ala Ala Gln Ser Asn Phe Ala Glu Thr 290 295 300 Val LysVal Phe Leu Lys His Pro Ser Val Lys Asp Asp Ser Asp Leu 305 310 315 320Glu Gly Arg Thr Ser Phe Met Trp Ala Ala Gly Lys Gly Ser Asp Asp 325 330335 Val Leu Arg Thr Met Leu Ser Leu Lys Ser Asp Ile Asp Ile Asn Met 340345 350 Ala Asp Lys Tyr Gly Gly Thr Ala Leu His Ala Ala Ala Leu Ser Gly355 360 365 His Val Ser Thr Val Lys Leu Leu Leu Glu Asn Asn Ala Gln ValAsp 370 375 380 Ala Thr Asp Val Met Lys His Thr Pro Leu Phe Arg Ala CysGlu Met 385 390 395 400 Gly His Lys Asp Val Ile Gln Thr Leu Ile Lys GlyGly Ala Arg Val 405 410 415 Asp Leu Val Asp Gln Asp Gly His Ser Leu LeuHis Trp Ala Ala Leu 420 425 430 Gly Gly Asn Ala Asp Val Cys Gln Ile LeuIle Glu Asn Lys Ile Asn 435 440 445 Pro Asn Val Gln Asp Tyr Ala Gly ArgThr Pro Leu Gln Cys Ala Ala 450 455 460 Tyr Gly Gly Tyr Ile Asn Cys MetAla Val Leu Met Glu Asn Asn Ala 465 470 475 480 Asp Pro Asn Ile Lys GlnAsp Lys Glu Gly Arg Thr Ala Leu His Trp 485 490 495 Ser Cys Asn Asn GlyTyr Leu 
Asp Ala Ile Lys Leu Leu Leu 500 505 510 27 3558 DNA Homo sapiens27 ggttgctccc ggttgctaag aagactatga acaagtcaga gaacctgctg tttgctggtt 60catcattagc atcacaagtc catgctgctg ccgttaatgg agataagggt gctctacaga 120ggctcatcgt aggaaactct gctcttaaag acaaagaaga tcagtttggg agaacaccac 180ttatgtattg cgtgttggct gacagattgg attgtgcaga tgctcttctg aaggcaggag 240cagatgtgaa taaaactgac catagccaga gaacagccct ccatcttgca gcccagaagg 300gaaattatcg tttcatgaaa ctcttactta cacgcagagc aaactggatg caaaaggatc 360tggaagagat gactcctttg cacttgacca cccggcacag gagccctaag tgtttggcac 420ttctgctgaa gtttatggca ccaggagaag tggatacaca ggataaaaac aagcaaacag 480ctctgcattg gagtgcctac tacaataacc ctgagcatgt gaagctgctc atcaagcatg 540attctaacat tgggattcct gatgttgaag gcaagatccc acttcactgg gcagccaacc 600ataaagatcc aagtgctgtt cacacagtga gatgcattct ggatgctgct ccaacagagt 660ctttactgaa ctggcaagac tacgagggtc gaactcctct tcactttgca gttgctgatg 720ggaatgtgac cgtggttgat gtcttgacct catatgaaag ctgcaatata acgtcttatg 780ataacttatt tcgaacccca ctgcactggg cagctttatt aggccatgca cagattgtcc 840atctcctttt agaaagaaat aagtctggaa ctatcccatc tgacagccaa ggagccacac 900ctttgcacta tgctgctcag agtaactttg ctgaaacggt taaagtgttt ttaaaacatc 960cttcagtgaa agatgattca gacctggaag gaagaacatc ctttatgtgg gcagctggca 1020aaggcagtga tgatgtcctt agaactatgc tgagcttaaa atcggacata gatattaaca 1080tggctgacaa atatggaggt acagctttgc atgctgctgc tctttctggc catgtcagca 1140ccgtgaagtt attactggaa aataatgctc aagtagatgc tactgatgtt atgaaacata 1200ctccactttt ccgagcctgt gagatgggac acaaagatgt gattcagaca ctcattaaag 1260gtggagcaag ggtagatcta gttgaccaag atggacattc tcttctacat tgggcagcac 1320tgggaggaaa tgctgatgtt tgccagatat taatagaaaa taagatcaat ccaaatgtcc 1380aggattatgc aggaagaacc cctttgcagt gtgcagcata tggaggctat atcaactgca 1440tggcagttct catggaaaac aatgcagacc ctaacattca agacaaagag ggaagaacag 1500ctttgcattg gtcctgcaac aatggatacc ttgatgccat taaattactg ctagactttg 1560ctgctttccc taatcagatg gaaaacaatg aagagagata cacacccctt gattatgctt 1620tgcttggtga gcgccatgaa gtgatccagt tcatgttgga gcacggtgcc 
ctgtccatcg 1680cagccataca agacatcgcc gccttcaaaa tccaagctgt ctacaaaggg tacaaggtca 1740gaaaagcctt ccgagacagg aaaaatctcc tcatgaagca tgaacagttg agaaaagatg 1800ctgctgccaa aaagcgagag gaagaaaaca aatgaaaaga ggcagaacag caaaaaggaa 1860ggcggagccc agattcctgc agaccccagg cccttccctg tctgcctagc acccaggatg 1920tgcccagcag gcagagccgg gcccccagca agcagcctcc tgctggcaac gtggcccaag 1980gccctgagcc aagagacagc agaggatctc caggagggtc tctaggcgga gccctccaga 2040aggagcagca tgtttcctca gatttgcagg gaacaaactc cagaaggcca aatgaaacag 2100ccagagaaca ttctaaaggc caatctgctt gtgtccactt cagacccaat gaaggcagtg 2160atggaagcag gcatccagga gttccctctg ttgagaagtc cagaggtgag acagctggcg 2220atgagcggtg tgcaaagggg aaaggtttcg tgaagcagcc ctcctgtatc agggtggctg 2280ggcctgatga gaaaggagag gactccaggc gggcaggtgc aagccttcca ccgcacgata 2340gccactggaa gcccagcagg cggcatgaca cagaacccaa ggccaaatgt gccccccaga 2400aaaggcgcac tcaagagctc agaggaggaa ggtgctctcc ggctggttct agccgccctg 2460gcagtgcccg gggggaggcg gtccatgctg ggcagaatcc tccccaccat cgtacaccaa 2520gaaacaaagt gacacaagcc aagctcacag gagggctcta ttcacatttg ccacagagca 2580cagaggagtt gaggtcagga gctaggaggc tggagacatc taccctgtcc gaggactttc 2640aggtatctaa ggagactgat ccagcacctg gtcccctctc tgggcagagt gtgaatattg 2700accttctccc cgtagagctc cgactgcaga taattcagag agaacgaagg aggaaggagc 2760tgtttcgcaa aaagaacaag gcagcagcag tcatccagcg cgcctggcga agctaccagc 2820tcaggaagca cctgtcccac cttcggcata tgaagcagct tggagctgga gatgtggaca 2880gatggaggca agagtctaca gcattgctcc tccaggtttg gaggaaggaa ctggaactaa 2940aattccccca aaccactgca gtaagcaagg cccccaagag tccatccaag ggcacctcag 3000gcacaaagtc caccaagcac tcagtgctta agcaaatcta tggttgttct cacgaaggga 3060aaatacatca tcctacaaga tctgtaaaag cctcttctgt gctgcgtctc aactcagtga 3120gcaacctaca gtgtatacat ctccttgaga acagtggaag atcaaagaac ttttcttata 3180acctgcaatc agctactcag ccaaaaaaca aaacaaaacc ttgactgcct atggaggaag 3240actgtgttcg ggggagctgg catagctagt gcagagttca gattttctgc tgataatctt 3300ttacaccttg ggaaaacttt aatatccgta cctgaaggct gattcaccta aaaatgtgtt 3360aactgaaaga aaatgtcaga 
atgtttcctt tctgctctta cacagcattg ttttgtcaat 3420caacacagcc tgcactgaaa ggacctgcat agactatgtc tgtgcaaagt gcctgagtgt 3480ctgctttcac ctcagtctgt acagttggaa atgagaattc ataattaaca gcaaaatcta 3540aggaaaacta aaataaaa 3558 28 602 PRT Homo sapiens 28 Met Asn Lys Ser GluAsn Leu Leu Phe Ala Gly Ser Ser Leu Ala Ser 1 5 10 15 Gln Val His AlaAla Ala Val Asn Gly Asp Lys Gly Ala Leu Gln Arg 20 25 30 Leu Ile Val GlyAsn Ser Ala Leu Lys Asp Lys Glu Asp Gln Phe Gly 35 40 45 Arg Thr Pro LeuMet Tyr Cys Val Leu Ala Asp Arg Leu Asp Cys Ala 50 55 60 Asp Ala Leu LeuLys Ala Gly Ala Asp Val Asn Lys Thr Asp His Ser 65 70 75 80 Gln Arg ThrAla Leu His Leu Ala Ala Gln Lys Gly Asn Tyr Arg Phe 85 90 95 Met Lys LeuLeu Leu Thr Arg Arg Ala Asn Trp Met Gln Lys Asp Leu 100 105 110 Glu GluMet Thr Pro Leu His Leu Thr Thr Arg His Arg Ser Pro Lys 115 120 125 CysLeu Ala Leu Leu Leu Lys Phe Met Ala Pro Gly Glu Val Asp Thr 130 135 140Gln Asp Lys Asn Lys Gln Thr Ala Leu His Trp Ser Ala Tyr Tyr Asn 145 150155 160 Asn Pro Glu His Val Lys Leu Leu Ile Lys His Asp Ser Asn Ile Gly165 170 175 Ile Pro Asp Val Glu Gly Lys Ile Pro Leu His Trp Ala Ala AsnHis 180 185 190 Lys Asp Pro Ser Ala Val His Thr Val Arg Cys Ile Leu AspAla Ala 195 200 205 Pro Thr Glu Ser Leu Leu Asn Trp Gln Asp Tyr Glu GlyArg Thr Pro 210 215 220 Leu His Phe Ala Val Ala Asp Gly Asn Val Thr ValVal Asp Val Leu 225 230 235 240 Thr Ser Tyr Glu Ser Cys Asn Ile Thr SerTyr Asp Asn Leu Phe Arg 245 250 255 Thr Pro Leu His Trp Ala Ala Leu LeuGly His Ala Gln Ile Val His 260 265 270 Leu Leu Leu Glu Arg Asn Lys SerGly Thr Ile Pro Ser Asp Ser Gln 275 280 285 Gly Ala Thr Pro Leu His TyrAla Ala Gln Ser Asn Phe Ala Glu Thr 290 295 300 Val Lys Val Phe Leu LysHis Pro Ser Val Lys Asp Asp Ser Asp Leu 305 310 315 320 Glu Gly Arg ThrSer Phe Met Trp Ala Ala Gly Lys Gly Ser Asp Asp 325 330 335 Val Leu ArgThr Met Leu Ser Leu Lys Ser Asp Ile Asp Ile Asn Met 340 345 350 Ala AspLys Tyr Gly Gly Thr Ala Leu His Ala Ala Ala Leu Ser Gly 355 360 365 HisVal Ser Thr Val Lys Leu Leu Leu 
Glu Asn Asn Ala Gln Val Asp 370 375 380Ala Thr Asp Val Met Lys His Thr Pro Leu Phe Arg Ala Cys Glu Met 385 390395 400 Gly His Lys Asp Val Ile Gln Thr Leu Ile Lys Gly Gly Ala Arg Val405 410 415 Asp Leu Val Asp Gln Asp Gly His Ser Leu Leu His Trp Ala AlaLeu 420 425 430 Gly Gly Asn Ala Asp Val Cys Gln Ile Leu Ile Glu Asn LysIle Asn 435 440 445 Pro Asn Val Gln Asp Tyr Ala Gly Arg Thr Pro Leu GlnCys Ala Ala 450 455 460 Tyr Gly Gly Tyr Ile Asn Cys Met Ala Val Leu MetGlu Asn Asn Ala 465 470 475 480 Asp Pro Asn Ile Gln Asp Lys Glu Gly ArgThr Ala Leu His Trp Ser 485 490 495 Cys Asn Asn Gly Tyr Leu Asp Ala IleLys Leu Leu Leu Asp Phe Ala 500 505 510 Ala Phe Pro Asn Gln Met Glu AsnAsn Glu Glu Arg Tyr Thr Pro Leu 515 520 525 Asp Tyr Ala Leu Leu Gly GluArg His Glu Val Ile Gln Phe Met Leu 530 535 540 Glu His Gly Ala Leu SerIle Ala Ala Ile Gln Asp Ile Ala Ala Phe 545 550 555 560 Lys Ile Gln AlaVal Tyr Lys Gly Tyr Lys Val Arg Lys Ala Phe Arg 565 570 575 Asp Arg LysAsn Leu Leu Met Lys His Glu Gln Leu Arg Lys Asp Ala 580 585 590 Ala AlaLys Lys Arg Glu Glu Glu Asn Lys 595 600 29 3558 DNA Homo sapiens 29ggttgctccc ggttgctaag aagactatga acaagtcaga gaacctgctg tttgctggtt 60catcattagc atcacaagtc catgctgctg ccgttaatgg agataagggt gctctacaga 120ggctcatcgt aggaaactct gctcttaaag acaaagaaga tcagtttggg agaacaccac 180ttatgtattg cgtgttggct gacagattgg attgtgcaga tgctcttctg aaggcaggag 240cagatgtgaa taaaactgac catagccaga gaacagccct ccatcttgca gcccagaagg 300gaaattatcg tttcatgaaa ctcttactta cacgcagagc aaactggatg caaaaggatc 360tggaagagat gactcctttg cacttgacca cccggcacag gagccctaag tgtttggcac 420ttctgctgaa gtttatggca ccaggagaag tggatacaca ggataaaaac aagcaaacag 480ctctgcattg gagtgcctac tacaataacc ctgagcatgt gaagctgctc atcaagcatg 540attctaacat tgggattcct gatgttgaag gcaagatccc acttcactgg gcagccaacc 600ataaagatcc aagtgctgtt cacacagtga gatgcattct ggatgctgct ccaacagagt 660ctttactgaa ctggcaagac tacgagggtc gaactcctct tcactttgca gttgctgatg 720ggaatgtgac cgtggttgat gtcttgacct catatgaaag ctgcaatata acgtcttatg 
780ataacttatt tcgaacccca ctgcactggg cagctttatt aggccatgca cagattgtcc 840atctcctttt agaaagaaat aagtctggaa ctatcccatc tgacagccaa ggagccacac 900ctttgcacta tgctgctcag agtaactttg ctgaaacggt taaagtgttt ttaaaacatc 960cttcagtgaa agatgattca gacctggaag gaagaacatc ctttatgtgg gcagctggca 1020aaggcagtga tgatgtcctt agaactatgc tgagcttaaa atcggacata gatattaaca 1080tggctgacaa atatggaggt acagctttgc atgctgctgc tctttctggc catgtcagca 1140ccgtgaagtt attactggaa aataatgctc aagtagatgc tactgatgtt atgaaacata 1200ctccactttt ctgagcctgt gagatgggac acaaagatgt gattcagaca ctcattaaag 1260gtggagcaag ggtagatcta gttgaccaag atggacattc tcttctacat tgggcagcac 1320tgggaggaaa tgctgatgtt tgccagatat taatagaaaa taagatcaat ccaaatgtcc 1380aggattatgc aggaagaacc cctttgcagt gtgcagcata tggaggctat atcaactgca 1440tggcagttct catggaaaac aatgcagacc ctaacattca agacaaagag ggaagaacag 1500ctttgcattg gtcctgcaac aatggatacc ttgatgccat taaattactg ctagactttg 1560ctgctttccc taatcagatg gaaaacaatg aagagagata cacacccctt gattatgctt 1620tgcttggtga gcgccatgaa gtgatccagt tcatgttgga gcacggtgcc ctgtccatcg 1680cagccataca agacatcgcc gccttcaaaa tccaagctgt ctacaaaggg tacaaggtca 1740gaaaagcctt ccgagacagg aaaaatctcc tcatgaagca tgaacagttg agaaaagatg 1800ctgctgccaa aaagcgagag gaagaaaaca aacgaaaaga ggcagaacag caaaaaggaa 1860ggcggagccc agattcctgc agaccccagg cccttccctg tctgcctagc acccaggatg 1920tgcccagcag gcagagccgg gcccccagca agcagcctcc tgctggcaac gtggcccaag 1980gccctgagcc aagagacagc agaggatctc caggagggtc tctaggcgga gccctccaga 2040aggagcagca tgtttcctca gatttgcagg gaacaaactc cagaaggcca aatgaaacag 2100ccagagaaca ttctaaaggc caatctgctt gtgtccactt cagacccaat gaaggcagtg 2160atggaagcag gcatccagga gttccctctg ttgagaagtc cagaggtgag acagctggcg 2220atgagcggtg tgcaaagggg aaaggtttcg tgaagcagcc ctcctgtatc agggtggctg 2280ggcctgatga gaaaggagag gactccaggc gggcaggtgc aagccttcca ccgcacgata 2340gccactggaa gcccagcagg cggcatgaca cagaacccaa ggccaaatgt gccccccaga 2400aaaggcgcac tcaagagctc agaggaggaa ggtgctctcc ggctggttct agccgccctg 2460gcagtgcccg gggggaggcg gtccatgctg 
ggcagaatcc tccccaccat cgtacaccaa 2520gaaacaaagt gacacaagcc aagctcacag gagggctcta ttcacatttg ccacagagca 2580cagaggagtt gaggtcagga gctaggaggc tggagacatc taccctgtcc gaggactttc 2640aggtatctaa ggagactgat ccagcacctg gtcccctctc tgggcagagt gtgaatattg 2700accttctccc cgtagagctc cgactgcaga taattcagag agaacgaagg aggaaggagc 2760tgtttcgcaa aaagaacaag gcagcagcag tcatccagcg cgcctggcga agctaccagc 2820tcaggaagca cctgtcccac cttcggcata tgaagcagct tggagctgga gatgtggaca 2880gatggaggca agagtctaca gcattgctcc tccaggtttg gaggaaggaa ctggaactaa 2940aattccccca aaccactgca gtaagcaagg cccccaagag tccatccaag ggcacctcag 3000gcacaaagtc caccaagcac tcagtgctta agcaaatcta tggttgttct cacgaaggga 3060aaatacatca tcctacaaga tctgtaaaag cctcttctgt gctgcgtctc aactcagtga 3120gcaacctaca gtgtatacat ctccttgaga acagtggaag atcaaagaac ttttcttata 3180acctgcaatc agctactcag ccaaaaaaca aaacaaaacc ttgactgcct atggaggaag 3240actgtgttcg ggggagctgg catagctagt gcagagttca gattttctgc tgataatctt 3300ttacaccttg ggaaaacttt aatatccgta cctgaaggct gattcaccta aaaatgtgtt 3360aactgaaaga aaatgtcaga atgtttcctt tctgctctta cacagcattg ttttgtcaat 3420caacacagcc tgcactgaaa ggacctgcat agactatgtc tgtgcaaagt gcctgagtgt 3480ctgctttcac ctcagtctgt acagttggaa atgagaattc ataattaaca gcaaaatcta 3540aggaaaacta aaataaaa 3558 30 395 PRT Homo sapiens 30 Met Asn Lys Ser GluAsn Leu Leu Phe Ala Gly Ser Ser Leu Ala Ser 1 5 10 15 Gln Val His AlaAla Ala Val Asn Gly Asp Lys Gly Ala Leu Gln Arg 20 25 30 Leu Ile Val GlyAsn Ser Ala Leu Lys Asp Lys Glu Asp Gln Phe Gly 35 40 45 Arg Thr Pro LeuMet Tyr Cys Val Leu Ala Asp Arg Leu Asp Cys Ala 50 55 60 Asp Ala Leu LeuLys Ala Gly Ala Asp Val Asn Lys Thr Asp His Ser 65 70 75 80 Gln Arg ThrAla Leu His Leu Ala Ala Gln Lys Gly Asn Tyr Arg Phe 85 90 95 Met Lys LeuLeu Leu Thr Arg Arg Ala Asn Trp Met Gln Lys Asp Leu 100 105 110 Glu GluMet Thr Pro Leu His Leu Thr Thr Arg His Arg Ser Pro Lys 115 120 125 CysLeu Ala Leu Leu Leu Lys Phe Met Ala Pro Gly Glu Val Asp Thr 130 135 140Gln Asp Lys Asn Lys Gln Thr Ala Leu His Trp Ser Ala Tyr Tyr 
Asn 145 150155 160 Asn Pro Glu His Val Lys Leu Leu Ile Lys His Asp Ser Asn Ile Gly165 170 175 Ile Pro Asp Val Glu Gly Lys Ile Pro Leu His Trp Ala Ala AsnHis 180 185 190 Lys Asp Pro Ser Ala Val His Thr Val Arg Cys Ile Leu AspAla Ala 195 200 205 Pro Thr Glu Ser Leu Leu Asn Trp Gln Asp Tyr Glu GlyArg Thr Pro 210 215 220 Leu His Phe Ala Val Ala Asp Gly Asn Val Thr ValVal Asp Val Leu 225 230 235 240 Thr Ser Tyr Glu Ser Cys Asn Ile Thr SerTyr Asp Asn Leu Phe Arg 245 250 255 Thr Pro Leu His Trp Ala Ala Leu LeuGly His Ala Gln Ile Val His 260 265 270 Leu Leu Leu Glu Arg Asn Lys SerGly Thr Ile Pro Ser Asp Ser Gln 275 280 285 Gly Ala Thr Pro Leu His TyrAla Ala Gln Ser Asn Phe Ala Glu Thr 290 295 300 Val Lys Val Phe Leu LysHis Pro Ser Val Lys Asp Asp Ser Asp Leu 305 310 315 320 Glu Gly Arg ThrSer Phe Met Trp Ala Ala Gly Lys Gly Ser Asp Asp 325 330 335 Val Leu ArgThr Met Leu Ser Leu Lys Ser Asp Ile Asp Ile Asn Met 340 345 350 Ala AspLys Tyr Gly Gly Thr Ala Leu His Ala Ala Ala Leu Ser Gly 355 360 365 HisVal Ser Thr Val Lys Leu Leu Leu Glu Asn Asn Ala Gln Val Asp 370 375 380Ala Thr Asp Val Met Lys His Thr Pro Leu Phe 385 390 395 31 3558 DNA Homosapiens 31 ggttgctccc ggttgctaag aagactatga acaagtcaga gaacctgctgtttgctggtt 60 catcattagc atcacaagtc catgctgctg ccgttaatgg agataagggtgctctacaga 120 ggctcatcgt aggaaactct gctcttaaag acaaagaaga tcagtttgggagaacaccac 180 ttatgtattg cgtgttggct gacagattgg attgtgcaga tgctcttctgaaggcaggag 240 cagatgtgaa taaaactgac catagccaga gaacagccct ccatcttgcagcccagaagg 300 gaaattatcg tttcatgaaa ctcttactta cacgcagagc aaactggatgcaaaaggatc 360 tggaagagat gactcctttg cacttgacca cccggcacag gagccctaagtgtttggcac 420 ttctgctgaa gtttatggca ccaggagaag tggatacaca ggataaaaacaagcaaacag 480 ctctgcattg gagtgcctac tacaataacc ctgagcatgt gaagctgctcatcaagcatg 540 attctaacat tgggattcct gatgttgaag gcaagatccc acttcactgggcagccaacc 600 ataaagatcc aagtgctgtt cacacagtga gatgcattct ggatgctgctccaacagagt 660 ctttactgaa ctggcaagac tacgagggtc gaactcctct tcactttgcagttgctgatg 720 ggaatgtgac 
cgtggttgat gtcttgacct catatgaaag ctgcaatataacgtcttatg 780 ataacttatt tcgaacccca ctgcactggg cagctttatt aggccatgcacagattgtcc 840 atctcctttt agaaagaaat aagtctggaa ctatcccatc tgacagccaaggagccacac 900 ctttgcacta tgctgctcag agtaactttg ctgaaacggt taaagtgtttttaaaacatc 960 cttcagtgaa agatgattca gacctggaag gaagaacatc ctttatgtgggcagctggca 1020 aaggcagtga tgatgtcctt agaactatgc tgagcttaaa atcggacatagatattaaca 1080 tggctgacaa atatggaggt acagctttgc atgctgctgc tctttctggccatgtcagca 1140 ccgtgaagtt attactggaa aataatgctc aagtagatgc tactgatgttatgaaacata 1200 ctccactttt ccgagcctgt gagatgggac acaaagatgt gattcagacactcattaaag 1260 gtggagcaag ggtagatcta gttgaccaag atggacattc tcttctacattgggcagcac 1320 tgggaggaaa tgctgatgtt tgccagatat taatagaaaa taagatcaatccaaatgtcc 1380 aggattatgc aggaagaacc cctttgcagt gtgcagcata tggaggctatatcaactgca 1440 tggcagttct catggaaaac aatgcagacc gtaacattca agacaaagagggaagaacag 1500 ctttgcattg gtcctgcaac aatggatacc ttgatgccat taaattactgctagactttg 1560 ctgctttccc taatcagatg gaaaacaatg aagagagata cacaccccttgattatgctt 1620 tgcttggtga gcgccatgaa gtgatccagt tcatgttgga gcacggtgccctgtccatcg 1680 cagccataca agacatcgcc gccttcaaaa tccaagctgt ctacaaagggtacaaggtca 1740 gaaaagcctt ccgagacagg aaaaatctcc tcatgaagca tgaacagttgagaaaagatg 1800 ctgctgccaa aaagcgagag gaagaaaaca aacgaaaaga ggcagaacagcaaaaaggaa 1860 ggcggagccc agattcctgc agaccccagg cccttccctg tctgcctagcacccaggatg 1920 tgcccagcag gcagagccgg gcccccagca agcagcctcc tgctggcaacgtggcccaag 1980 gccctgagcc aagagacagc agaggatctc caggagggtc tctaggcggagccctccaga 2040 aggagcagca tgtttcctca gatttgcagg gaacaaactc cagaaggccaaatgaaacag 2100 ccagagaaca ttctaaaggc caatctgctt gtgtccactt cagacccaatgaaggcagtg 2160 atggaagcag gcatccagga gttccctctg ttgagaagtc cagaggtgagacagctggcg 2220 atgagcggtg tgcaaagggg aaaggtttcg tgaagcagcc ctcctgtatcagggtggctg 2280 ggcctgatga gaaaggagag gactccaggc gggcaggtgc aagccttccaccgcacgata 2340 gccactggaa gcccagcagg cggcatgaca cagaacccaa ggccaaatgtgccccccaga 2400 aaaggcgcac tcaagagctc agaggaggaa ggtgctctcc 
ggctggttctagccgccctg 2460 gcagtgcccg gggggaggcg gtccatgctg ggcagaatcc tccccaccatcgtacaccaa 2520 gaaacaaagt gacacaagcc aagctcacag gagggctcta ttcacatttgccacagagca 2580 cagaggagtt gaggtcagga gctaggaggc tggagacatc taccctgtccgaggactttc 2640 aggtatctaa ggagactgat ccagcacctg gtcccctctc tgggcagagtgtgaatattg 2700 accttctccc cgtagagctc cgactgcaga taattcagag agaacgaaggaggaaggagc 2760 tgtttcgcaa aaagaacaag gcagcagcag tcatccagcg cgcctggcgaagctaccagc 2820 tcaggaagca cctgtcccac cttcggcata tgaagcagct tggagctggagatgtggaca 2880 gatggaggca agagtctaca gcattgctcc tccaggtttg gaggaaggaactggaactaa 2940 aattccccca aaccactgca gtaagcaagg cccccaagag tccatccaagggcacctcag 3000 gcacaaagtc caccaagcac tcagtgctta agcaaatcta tggttgttctcacgaaggga 3060 aaatacatca tcctacaaga tctgtaaaag cctcttctgt gctgcgtctcaactcagtga 3120 gcaacctaca gtgtatacat ctccttgaga acagtggaag atcaaagaacttttcttata 3180 acctgcaatc agctactcag ccaaaaaaca aaacaaaacc ttgactgcctatggaggaag 3240 actgtgttcg ggggagctgg catagctagt gcagagttca gattttctgctgataatctt 3300 ttacaccttg ggaaaacttt aatatccgta cctgaaggct gattcacctaaaaatgtgtt 3360 aactgaaaga aaatgtcaga atgtttcctt tctgctctta cacagcattgttttgtcaat 3420 caacacagcc tgcactgaaa ggacctgcat agactatgtc tgtgcaaagtgcctgagtgt 3480 ctgctttcac ctcagtctgt acagttggaa atgagaattc ataattaacagcaaaatcta 3540 aggaaaacta aaataaaa 3558 32 1065 PRT Homo sapiens 32 MetAsn Lys Ser Glu Asn Leu Leu Phe Ala Gly Ser Ser Leu Ala Ser 1 5 10 15Gln Val His Ala Ala Ala Val Asn Gly Asp Lys Gly Ala Leu Gln Arg 20 25 30Leu Ile Val Gly Asn Ser Ala Leu Lys Asp Lys Glu Asp Gln Phe Gly 35 40 45Arg Thr Pro Leu Met Tyr Cys Val Leu Ala Asp Arg Leu Asp Cys Ala 50 55 60Asp Ala Leu Leu Lys Ala Gly Ala Asp Val Asn Lys Thr Asp His Ser 65 70 7580 Gln Arg Thr Ala Leu His Leu Ala Ala Gln Lys Gly Asn Tyr Arg Phe 85 9095 Met Lys Leu Leu Leu Thr Arg Arg Ala Asn Trp Met Gln Lys Asp Leu 100105 110 Glu Glu Met Thr Pro Leu His Leu Thr Thr Arg His Arg Ser Pro Lys115 120 125 Cys Leu Ala Leu Leu Leu Lys Phe Met Ala Pro Gly Glu Val AspThr 130 135 140 
Gln Asp Lys Asn Lys Gln Thr Ala Leu His Trp Ser Ala TyrTyr Asn 145 150 155 160 Asn Pro Glu His Val Lys Leu Leu Ile Lys His AspSer Asn Ile Gly 165 170 175 Ile Pro Asp Val Glu Gly Lys Ile Pro Leu HisTrp Ala Ala Asn His 180 185 190 Lys Asp Pro Ser Ala Val His Thr Val ArgCys Ile Leu Asp Ala Ala 195 200 205 Pro Thr Glu Ser Leu Leu Asn Trp GlnAsp Tyr Glu Gly Arg Thr Pro 210 215 220 Leu His Phe Ala Val Ala Asp GlyAsn Val Thr Val Val Asp Val Leu 225 230 235 240 Thr Ser Tyr Glu Ser CysAsn Ile Thr Ser Tyr Asp Asn Leu Phe Arg 245 250 255 Thr Pro Leu His TrpAla Ala Leu Leu Gly His Ala Gln Ile Val His 260 265 270 Leu Leu Leu GluArg Asn Lys Ser Gly Thr Ile Pro Ser Asp Ser Gln 275 280 285 Gly Ala ThrPro Leu His Tyr Ala Ala Gln Ser Asn Phe Ala Glu Thr 290 295 300 Val LysVal Phe Leu Lys His Pro Ser Val Lys Asp Asp Ser Asp Leu 305 310 315 320Glu Gly Arg Thr Ser Phe Met Trp Ala Ala Gly Lys Gly Ser Asp Asp 325 330335 Val Leu Arg Thr Met Leu Ser Leu Lys Ser Asp Ile Asp Ile Asn Met 340345 350 Ala Asp Lys Tyr Gly Gly Thr Ala Leu His Ala Ala Ala Leu Ser Gly355 360 365 His Val Ser Thr Val Lys Leu Leu Leu Glu Asn Asn Ala Gln ValAsp 370 375 380 Ala Thr Asp Val Met Lys His Thr Pro Leu Phe Arg Ala CysGlu Met 385 390 395 400 Gly His Lys Asp Val Ile Gln Thr Leu Ile Lys GlyGly Ala Arg Val 405 410 415 Asp Leu Val Asp Gln Asp Gly His Ser Leu LeuHis Trp Ala Ala Leu 420 425 430 Gly Gly Asn Ala Asp Val Cys Gln Ile LeuIle Glu Asn Lys Ile Asn 435 440 445 Pro Asn Val Gln Asp Tyr Ala Gly ArgThr Pro Leu Gln Cys Ala Ala 450 455 460 Tyr Gly Gly Tyr Ile Asn Cys MetAla Val Leu Met Glu Asn Asn Ala 465 470 475 480 Asp Arg Asn Ile Gln AspLys Glu Gly Arg Thr Ala Leu His Trp Ser 485 490 495 Cys Asn Asn Gly TyrLeu Asp Ala Ile Lys Leu Leu Leu Asp Phe Ala 500 505 510 Ala Phe Pro AsnGln Met Glu Asn Asn Glu Glu Arg Tyr Thr Pro Leu 515 520 525 Asp Tyr AlaLeu Leu Gly Glu Arg His Glu Val Ile Gln Phe Met Leu 530 535 540 Glu HisGly Ala Leu Ser Ile Ala Ala Ile Gln Asp Ile Ala Ala Phe 545 550 555 560Lys Ile Gln Ala Val Tyr Lys 
Gly Tyr Lys Val Arg Lys Ala Phe Arg 565 570575 Asp Arg Lys Asn Leu Leu Met Lys His Glu Gln Leu Arg Lys Asp Ala 580585 590 Ala Ala Lys Lys Arg Glu Glu Glu Asn Lys Arg Lys Glu Ala Glu Gln595 600 605 Gln Lys Gly Arg Arg Ser Pro Asp Ser Cys Arg Pro Gln Ala LeuPro 610 615 620 Cys Leu Pro Ser Thr Gln Asp Val Pro Ser Arg Gln Ser ArgAla Pro 625 630 635 640 Ser Lys Gln Pro Pro Ala Gly Asn Val Ala Gln GlyPro Glu Pro Arg 645 650 655 Asp Ser Arg Gly Ser Pro Gly Gly Ser Leu GlyGly Ala Leu Gln Lys 660 665 670 Glu Gln His Val Ser Ser Asp Leu Gln GlyThr Asn Ser Arg Arg Pro 675 680 685 Asn Glu Thr Ala Arg Glu His Ser LysGly Gln Ser Ala Cys Val His 690 695 700 Phe Arg Pro Asn Glu Gly Ser AspGly Ser Arg His Pro Gly Val Pro 705 710 715 720 Ser Val Glu Lys Ser ArgGly Glu Thr Ala Gly Asp Glu Arg Cys Ala 725 730 735 Lys Gly Lys Gly PheVal Lys Gln Pro Ser Cys Ile Arg Val Ala Gly 740 745 750 Pro Asp Glu LysGly Glu Asp Ser Arg Arg Ala Gly Ala Ser Leu Pro 755 760 765 Pro His AspSer His Trp Lys Pro Ser Arg Arg His Asp Thr Glu Pro 770 775 780 Lys AlaLys Cys Ala Pro Gln Lys Arg Arg Thr Gln Glu Leu Arg Gly 785 790 795 800Gly Arg Cys Ser Pro Ala Gly Ser Ser Arg Pro Gly Ser Ala Arg Gly 805 810815 Glu Ala Val His Ala Gly Gln Asn Pro Pro His His Arg Thr Pro Arg 820825 830 Asn Lys Val Thr Gln Ala Lys Leu Thr Gly Gly Leu Tyr Ser His Leu835 840 845 Pro Gln Ser Thr Glu Glu Leu Arg Ser Gly Ala Arg Arg Leu GluThr 850 855 860 Ser Thr Leu Ser Glu Asp Phe Gln Val Ser Lys Glu Thr AspPro Ala 865 870 875 880 Pro Gly Pro Leu Ser Gly Gln Ser Val Asn Ile AspLeu Leu Pro Val 885 890 895 Glu Leu Arg Leu Gln Ile Ile Gln Arg Glu ArgArg Arg Lys Glu Leu 900 905 910 Phe Arg Lys Lys Asn Lys Ala Ala Ala ValIle Gln Arg Ala Trp Arg 915 920 925 Ser Tyr Gln Leu Arg Lys His Leu SerHis Leu Arg His Met Lys Gln 930 935 940 Leu Gly Ala Gly Asp Val Asp ArgTrp Arg Gln Glu Ser Thr Ala Leu 945 950 955 960 Leu Leu Gln Val Trp ArgLys Glu Leu Glu Leu Lys Phe Pro Gln Thr 965 970 975 Thr Ala Val Ser LysAla Pro Lys Ser Pro Ser Lys Gly Thr Ser 
Gly 980 985 990 Thr Lys Ser ThrLys His Ser Val Leu Lys Gln Ile Tyr Gly Cys Ser 995 1000 1005 His GluGly Lys Ile His His Pro Thr Arg Ser Val Lys Ala Ser 1010 1015 1020 SerVal Leu Arg Leu Asn Ser Val Ser Asn Leu Gln Cys Ile His 1025 1030 1035Leu Leu Glu Asn Ser Gly Arg Ser Lys Asn Phe Ser Tyr Asn Leu 1040 10451050 Gln Ser Ala Thr Gln Pro Lys Asn Lys Thr Lys Pro 1055 1060 1065 333557 DNA Homo sapiens 33 ggttgctccc ggttgctaag aagactatga acaagtcagagaacctgctg tttgctggtt 60 catcattagc atcacaagtc catgctgctg ccgttaatggagataagggt gctctacaga 120 ggctcatcgt aggaaactct gctcttaaag acaaagaagatcagtttggg agaacaccac 180 ttatgtattg cgtgttggct gacagattgg attgtgcagatgctcttctg aaggcaggag 240 cagatgtgaa taaaactgac catagccaga gaacagccctccatcttgca gcccagaagg 300 gaaattatcg tttcatgaaa ctcttactta cacgcagagcaaactggatg caaaaggatc 360 tggaagagat gactcctttg cacttgacca cccggcacaggagccctaag tgtttggcac 420 ttctgctgaa gtttatggca ccaggagaag tggatacacaggataaaaac aagcaaacag 480 ctctgcattg gagtgcctac tacaataacc ctgagcatgtgaagctgctc atcaagcatg 540 attctaacat tgggattcct gatgttgaag gcaagatcccacttcactgg gcagccaacc 600 ataaagatcc aagtgctgtt cacacagtga gatgcattctggatgctgct ccaacagagt 660 ctttactgaa ctggcaagac tacgagggtc gaactcctcttcactttgca gttgctgatg 720 ggaatgtgac cgtggttgat gtcttgacct catatgaaagctgcaatata acgtcttatg 780 ataacttatt tcgaacccca ctgcactggg cagctttattaggccatgca cagattgtcc 840 atctcctttt agaaagaaat aagtctggaa ctatcccatctgacagccaa ggagccacac 900 ctttgcacta tgctgctcag agtaactttg ctgaaacggttaaagtgttt ttaaaacatc 960 cttcagtgaa agatgattca gacctggaag gaagaacatcctttatgtgg gcagctggca 1020 aaggcagtga tgatgtcctt agaactatgc tgagcttaaaatcggacata gatattaaca 1080 tggctgacaa atatggaggt acagctttgc atgctgctgctctttctggc catgtcagca 1140 ccgtgaagtt attactggaa aataatgctc aagtagatgctactgatgtt atgaaacata 1200 ctccactttt ccgagcctgt gagatgggac acaaagatgtgattcagaca ctcattaaag 1260 gtggagcaag ggtagatcta gttgaccaag atggacattctcttctacat tgggcagcac 1320 tgggaggaaa tgctgatgtt tgccagatat taatagaaaataagatcaat ccaaatgtcc 1380 
aggattatgc aggaagaacc cctttgcagt gtgcagcatatggaggctat atcaactgca 1440 tggcagttct catggaaaac aatgcagacc ctaacattcaagacaaagag ggaagaacag 1500 ctttgcattg gtcctgcaac aatggatacc ttgatgccattaaattactg ctagactttg 1560 ctgctttccc taatcagatg gaaaacaatg aagagagatacacacccctt gattatgctt 1620 tgcttggtga gcgccatgaa gtgatccagt tcatgttggagcacggtgcc ctgtccatcg 1680 cagccataca agacatcgcc gccttcaaaa tccaagctgtctacaaaggg tacaaggtca 1740 gaaaagcctt ccgagacagg aaaaatctcc tcatgaagcatgaacagttg agaaaagatg 1800 ctgctgccaa aaagcgagag gaagaaaaca aacgaaaagaggcagaacag caaaaaggaa 1860 ggcggagccc agattcctgc agaccccagg cccttccctgtctgcctagc acccaggatg 1920 tgcccagcag gcagagccgg gcccccagca agcagcctcctgctggcaac gtggcccaag 1980 gccctgagcc aagagacagc agaggatctc caggagggtctctaggcgga gccctccaga 2040 aggagcagca tgtttcctca gatttgcagg gaacaaactccagaaggcca aatgaaacag 2100 ccagagaaca ttctaaaggc caatctgctt gtgtccacttcagacccaat gaaggcagtg 2160 atggaagcag gcatccagga gttccctctg ttgagaagtccagaggtgag acagctggcg 2220 atgagcggtg tgcaaagggg aaaggtttcg tgaagcagccctcctgtatc agggtggctg 2280 ggcctgatga gaaaggagag gactccaggc gggcaggtgcaagccttcca ccgcacgata 2340 gccactggaa gcccagcagg cggcatgaca cagaacccaaggccaaatgt gccccccaga 2400 aaaggcgcac tcaagagctc agaggaggaa ggtgctctccggctggttct agccgccctg 2460 gcagtgcccg gggggaggcg gtccatgctg ggcagaatcctccccaccat cgtacaccaa 2520 gaaacaaagt gacacaagcc aagctcacag gagggctctattcacatttg ccacagagca 2580 cagaggagtt gaggtcagga gctaggaggc tggagacatctaccctgtcc gaggactttc 2640 aggtatctaa ggagactgat ccagcacctg gtcccctctctgggcagagt gtgaatattg 2700 accttctccc cgtagagctc cgactgcaga taattcagagagaacgaagg aggaaggagc 2760 tgtttcgcaa aaagaacaag gcagcagcag tcatccagcgcgcctggcga agctaccagc 2820 tcaggaagca cctgtcccac cttcggcata tgaagcagcttggagctgga gatgtggaca 2880 gatggaggca agagtctaca gcattgctcc tccaggtttggaggaaggaa ctgaactaaa 2940 attcccccaa accactgcag taagcaaggc ccccaagagtccatccaagg gcacctcagg 3000 cacaaagtcc accaagcact cagtgcttaa gcaaatctatggttgttctc acgaagggaa 3060 aatacatcat cctacaagat ctgtaaaagc 
ctcttctgtgctgcgtctca actcagtgag 3120 caacctacag tgtatacatc tccttgagaa cagtggaagatcaaagaact tttcttataa 3180 cctgcaatca gctactcagc caaaaaacaa aacaaaaccttgactgccta tggaggaaga 3240 ctgtgttcgg gggagctggc atagctagtg cagagttcagattttctgct gataatcttt 3300 tacaccttgg gaaaacttta atatccgtac ctgaaggctgattcacctaa aaatgtgtta 3360 actgaaagaa aatgtcagaa tgtttccttt ctgctcttacacagcattgt tttgtcaatc 3420 aacacagcct gcactgaaag gacctgcata gactatgtctgtgcaaagtg cctgagtgtc 3480 tgctttcacc tcagtctgta cagttggaaa tgagaattcataattaacag caaaatctaa 3540 ggaaaactaa aataaaa 3557 34 970 PRT Homosapiens 34 Met Asn Lys Ser Glu Asn Leu Leu Phe Ala Gly Ser Ser Leu AlaSer 1 5 10 15 Gln Val His Ala Ala Ala Val Asn Gly Asp Lys Gly Ala LeuGln Arg 20 25 30 Leu Ile Val Gly Asn Ser Ala Leu Lys Asp Lys Glu Asp GlnPhe Gly 35 40 45 Arg Thr Pro Leu Met Tyr Cys Val Leu Ala Asp Arg Leu AspCys Ala 50 55 60 Asp Ala Leu Leu Lys Ala Gly Ala Asp Val Asn Lys Thr AspHis Ser 65 70 75 80 Gln Arg Thr Ala Leu His Leu Ala Ala Gln Lys Gly AsnTyr Arg Phe 85 90 95 Met Lys Leu Leu Leu Thr Arg Arg Ala Asn Trp Met GlnLys Asp Leu 100 105 110 Glu Glu Met Thr Pro Leu His Leu Thr Thr Arg HisArg Ser Pro Lys 115 120 125 Cys Leu Ala Leu Leu Leu Lys Phe Met Ala ProGly Glu Val Asp Thr 130 135 140 Gln Asp Lys Asn Lys Gln Thr Ala Leu HisTrp Ser Ala Tyr Tyr Asn 145 150 155 160 Asn Pro Glu His Val Lys Leu LeuIle Lys His Asp Ser Asn Ile Gly 165 170 175 Ile Pro Asp Val Glu Gly LysIle Pro Leu His Trp Ala Ala Asn His 180 185 190 Lys Asp Pro Ser Ala ValHis Thr Val Arg Cys Ile Leu Asp Ala Ala 195 200 205 Pro Thr Glu Ser LeuLeu Asn Trp Gln Asp Tyr Glu Gly Arg Thr Pro 210 215 220 Leu His Phe AlaVal Ala Asp Gly Asn Val Thr Val Val Asp Val Leu 225 230 235 240 Thr SerTyr Glu Ser Cys Asn Ile Thr Ser Tyr Asp Asn Leu Phe Arg 245 250 255 ThrPro Leu His Trp Ala Ala Leu Leu Gly His Ala Gln Ile Val His 260 265 270Leu Leu Leu Glu Arg Asn Lys Ser Gly Thr Ile Pro Ser Asp Ser Gln 275 280285 Gly Ala Thr Pro Leu His Tyr Ala Ala Gln Ser Asn Phe Ala Glu Thr 290295 300 Val 
Lys Val Phe Leu Lys His Pro Ser Val Lys Asp Asp Ser Asp Leu305 310 315 320 Glu Gly Arg Thr Ser Phe Met Trp Ala Ala Gly Lys Gly SerAsp Asp 325 330 335 Val Leu Arg Thr Met Leu Ser Leu Lys Ser Asp Ile AspIle Asn Met 340 345 350 Ala Asp Lys Tyr Gly Gly Thr Ala Leu His Ala AlaAla Leu Ser Gly 355 360 365 His Val Ser Thr Val Lys Leu Leu Leu Glu AsnAsn Ala Gln Val Asp 370 375 380 Ala Thr Asp Val Met Lys His Thr Pro LeuPhe Arg Ala Cys Glu Met 385 390 395 400 Gly His Lys Asp Val Ile Gln ThrLeu Ile Lys Gly Gly Ala Arg Val 405 410 415 Asp Leu Val Asp Gln Asp GlyHis Ser Leu Leu His Trp Ala Ala Leu 420 425 430 Gly Gly Asn Ala Asp ValCys Gln Ile Leu Ile Glu Asn Lys Ile Asn 435 440 445 Pro Asn Val Gln AspTyr Ala Gly Arg Thr Pro Leu Gln Cys Ala Ala 450 455 460 Tyr Gly Gly TyrIle Asn Cys Met Ala Val Leu Met Glu Asn Asn Ala 465 470 475 480 Asp ProAsn Ile Gln Asp Lys Glu Gly Arg Thr Ala Leu His Trp Ser 485 490 495 CysAsn Asn Gly Tyr Leu Asp Ala Ile Lys Leu Leu Leu Asp Phe Ala 500 505 510Ala Phe Pro Asn Gln Met Glu Asn Asn Glu Glu Arg Tyr Thr Pro Leu 515 520525 Asp Tyr Ala Leu Leu Gly Glu Arg His Glu Val Ile Gln Phe Met Leu 530535 540 Glu His Gly Ala Leu Ser Ile Ala Ala Ile Gln Asp Ile Ala Ala Phe545 550 555 560 Lys Ile Gln Ala Val Tyr Lys Gly Tyr Lys Val Arg Lys AlaPhe Arg 565 570 575 Asp Arg Lys Asn Leu Leu Met Lys His Glu Gln Leu ArgLys Asp Ala 580 585 590 Ala Ala Lys Lys Arg Glu Glu Glu Asn Lys Arg LysGlu Ala Glu Gln 595 600 605 Gln Lys Gly Arg Arg Ser Pro Asp Ser Cys ArgPro Gln Ala Leu Pro 610 615 620 Cys Leu Pro Ser Thr Gln Asp Val Pro SerArg Gln Ser Arg Ala Pro 625 630 635 640 Ser Lys Gln Pro Pro Ala Gly AsnVal Ala Gln Gly Pro Glu Pro Arg 645 650 655 Asp Ser Arg Gly Ser Pro GlyGly Ser Leu Gly Gly Ala Leu Gln Lys 660 665 670 Glu Gln His Val Ser SerAsp Leu Gln Gly Thr Asn Ser Arg Arg Pro 675 680 685 Asn Glu Thr Ala ArgGlu His Ser Lys Gly Gln Ser Ala Cys Val His 690 695 700 Phe Arg Pro AsnGlu Gly Ser Asp Gly Ser Arg His Pro Gly Val Pro 705 710 715 720 Ser ValGlu Lys Ser Arg Gly Glu 
Thr Ala Gly Asp Glu Arg Cys Ala 725 730 735 LysGly Lys Gly Phe Val Lys Gln Pro Ser Cys Ile Arg Val Ala Gly 740 745 750Pro Asp Glu Lys Gly Glu Asp Ser Arg Arg Ala Gly Ala Ser Leu Pro 755 760765 Pro His Asp Ser His Trp Lys Pro Ser Arg Arg His Asp Thr Glu Pro 770775 780 Lys Ala Lys Cys Ala Pro Gln Lys Arg Arg Thr Gln Glu Leu Arg Gly785 790 795 800 Gly Arg Cys Ser Pro Ala Gly Ser Ser Arg Pro Gly Ser AlaArg Gly 805 810 815 Glu Ala Val His Ala Gly Gln Asn Pro Pro His His ArgThr Pro Arg 820 825 830 Asn Lys Val Thr Gln Ala Lys Leu Thr Gly Gly LeuTyr Ser His Leu 835 840 845 Pro Gln Ser Thr Glu Glu Leu Arg Ser Gly AlaArg Arg Leu Glu Thr 850 855 860 Ser Thr Leu Ser Glu Asp Phe Gln Val SerLys Glu Thr Asp Pro Ala 865 870 875 880 Pro Gly Pro Leu Ser Gly Gln SerVal Asn Ile Asp Leu Leu Pro Val 885 890 895 Glu Leu Arg Leu Gln Ile IleGln Arg Glu Arg Arg Arg Lys Glu Leu 900 905 910 Phe Arg Lys Lys Asn LysAla Ala Ala Val Ile Gln Arg Ala Trp Arg 915 920 925 Ser Tyr Gln Leu ArgLys His Leu Ser His Leu Arg His Met Lys Gln 930 935 940 Leu Gly Ala GlyAsp Val Asp Arg Trp Arg Gln Glu Ser Thr Ala Leu 945 950 955 960 Leu LeuGln Val Trp Arg Lys Glu Leu Glu 965 970 35 3558 DNA Homo sapiens 35ggttgctccc ggttgctaag aagactatga acaagtcaga gaacctgctg tttgctggtt 60catcattagc atcacaagtc catgctgctg ccgttaatgg agataagggt gctctacaga 120ggctcatcgt aggaaactct gctcttaaag acaaagaaga tcagtttggg agaacaccac 180ttatgtattg cgtgttggct gacagattgg attgtgcaga tgctcttctg aaggcaggag 240cagatgtgaa taaaactgac catagccaga gaacagccct ccatcttgca gcccagaagg 300gaaattatcg tttcatgaaa ctcttactta cacgcagagc aaactggatg caaaaggatc 360tggaagagat gactcctttg cacttgacca cccggcacag gagccctaag tgtttggcac 420ttctgctgaa gtttatggca ccaggagaag tggatacaca ggataaaaac aagcaaacag 480ctctgcattg gagtgcctac tacaataacc ctgagcatgt gaagctgctc atcaagcatg 540attctaacat tgggattcct gatgttgaag gcaagatccc acttcactgg gcagccaacc 600ataaagatcc aagtgctgtt cacacagtga gatgcattct ggatgctgct ccaacagagt 660ctttactgaa ctggcaagac tacgagggtc gaactcctct tcactttgca 
gttgctgatg 720ggaatgtgac cgtggttgat gtcttgacct catatgaaag ctgcaatata acgtcttatg 780ataacttatt tcgaacccca ctgcactggg cagctttatt aggccatgca cagattgtcc 840atctcctttt agaaagaaat aagtctggaa ctatcccatc tgacagccaa ggagccacac 900ctttgcacta tgctgctcag agtaactttg ctgaaacggt taaagtgttt ttaaaacatc 960cttcagtgaa agatgattca gacctggaag gaagaacatc ctttatgtgg gcagctggca 1020aaggcagtga tgatgtcctt agaactatgc tgagcttaaa atcggacata gatattaaca 1080tggctgacaa atatggaggt acagctttgc atgctgctgc tctttctggc catgtcagca 1140ccgtgaagtt attactggaa aataatgctc aagtagatgc tactgatgtt atgaaacata 1200ctccactttt ccgagcctgt gagatgggac acaaagatgt gattcagaca ctcattaaag 1260gtggagcaag ggtagatcta gttgaccaag atggacattc tcttctacat tgggcagcac 1320tgggaggaaa tgctgatgtt tgccagatat taatagaaaa taagatcaat ccaaatgtcc 1380aggattatgc aggaagaacc cctttgcagt gtgcagcata tggaggctat atcaactgca 1440tggcagttct catggaaaac aatgcagacc ctaacattca agacaaagag ggaagaacag 1500ctttgcattg gtcctgcaac aatggatacc ttgatgccat taaattactg ctagactttg 1560ctgctttccc taatcagatg gaaaacaatg aagagagata cacacccctt gattatgctt 1620tgcttggtga gcgccatgaa gtgatccagt tcatgttgga gcacggtgcc ctgtccatcg 1680cagccataca agacatcgcc gccttcaaaa tccaagctgt ctacaaaggg tacaaggtca 1740gaaaagcctt ccgagacagg aaaaatctcc tcatgaagca tgaacagttg agaaaagatg 1800ctgctgccaa aaagcgagag gaagaaaaca aacgaaaaga ggcagaacag caaaaaggaa 1860ggcggagccc agattcctgc agaccccagg cccttccctg tctgcctagc acccaggatg 1920tgcccagcag gcagagccgg gcccccagca agcagcctcc tgctggcaac gtggcccaag 1980gccctgagcc aagagacagc agaggatctc caggagggtc tctaggcgga gccctccaga 2040aggagcagca tgtttcctca gatttgcagg gaacaaactc cagaaggcca aatgaaacag 2100ccagagaaca ttctaaaggc caatctgctt gtgtccactt cagacccaat gaaggcagtg 2160atggaagcag gcatccagga gttccctctg ttgagaagtc cagaggtgag acagctggcg 2220atgagcggtg tgcaaagggg aaaggtttcg tgaagcagcc ctcctgtatc agggtggctg 2280ggcctgatga gaaaggagag gactccaggc gggcaggtgc aagccttcca ccgcacgata 2340gccactggaa gcccagcagg cggcatgaca cagaacccaa ggccaaatgt gccccccaga 2400aaaggcgcac tcaagagctc 
agaggaggaa ggtgctctcc ggctggttct agccgccctg 2460gcagtgcccg gggggaggcg gtccatgctg ggcagaatcc tccccaccat cgtacaccaa 2520gaaacaaagt gacacaagcc aagctcacag gagggctcta ttcacatttg ccacagagca 2580cagaggagtt gaggtcagga gctaggaggc tggagacatc taccctgtcc gaggactttc 2640aggtatctaa ggagactgat ccagcacctg gtcccctctc tgggcagagt gtgaatattg 2700accttctccc cgtagagctc cgactgcaga taattcagag agaatgaagg aggaaggagc 2760tgtttcgcaa aaagaacaag gcagcagcag tcatccagcg cgcctggcga agctaccagc 2820tcaggaagca cctgtcccac cttcggcata tgaagcagct tggagctgga gatgtggaca 2880gatggaggca agagtctaca gcattgctcc tccaggtttg gaggaaggaa ctggaactaa 2940aattccccca aaccactgca gtaagcaagg cccccaagag tccatccaag ggcacctcag 3000gcacaaagtc caccaagcac tcagtgctta agcaaatcta tggttgttct cacgaaggga 3060aaatacatca tcctacaaga tctgtaaaag cctcttctgt gctgcgtctc aactcagtga 3120gcaacctaca gtgtatacat ctccttgaga acagtggaag atcaaagaac ttttcttata 3180acctgcaatc agctactcag ccaaaaaaca aaacaaaacc ttgactgcct atggaggaag 3240actgtgttcg ggggagctgg catagctagt gcagagttca gattttctgc tgataatctt 3300ttacaccttg ggaaaacttt aatatccgta cctgaaggct gattcaccta aaaatgtgtt 3360aactgaaaga aaatgtcaga atgtttcctt tctgctctta cacagcattg ttttgtcaat 3420caacacagcc tgcactgaaa ggacctgcat agactatgtc tgtgcaaagt gcctgagtgt 3480ctgctttcac ctcagtctgt acagttggaa atgagaattc ataattaaca gcaaaatcta 3540aggaaaacta aaataaaa 3558 36 906 PRT Homo sapiens 36 Met Asn Lys Ser GluAsn Leu Leu Phe Ala Gly Ser Ser Leu Ala Ser 1 5 10 15 Gln Val His AlaAla Ala Val Asn Gly Asp Lys Gly Ala Leu Gln Arg 20 25 30 Leu Ile Val GlyAsn Ser Ala Leu Lys Asp Lys Glu Asp Gln Phe Gly 35 40 45 Arg Thr Pro LeuMet Tyr Cys Val Leu Ala Asp Arg Leu Asp Cys Ala 50 55 60 Asp Ala Leu LeuLys Ala Gly Ala Asp Val Asn Lys Thr Asp His Ser 65 70 75 80 Gln Arg ThrAla Leu His Leu Ala Ala Gln Lys Gly Asn Tyr Arg Phe 85 90 95 Met Lys LeuLeu Leu Thr Arg Arg Ala Asn Trp Met Gln Lys Asp Leu 100 105 110 Glu GluMet Thr Pro Leu His Leu Thr Thr Arg His Arg Ser Pro Lys 115 120 125 CysLeu Ala Leu Leu Leu Lys Phe Met Ala Pro Gly Glu 
Val Asp Thr 130 135 140Gln Asp Lys Asn Lys Gln Thr Ala Leu His Trp Ser Ala Tyr Tyr Asn 145 150155 160 Asn Pro Glu His Val Lys Leu Leu Ile Lys His Asp Ser Asn Ile Gly165 170 175 Ile Pro Asp Val Glu Gly Lys Ile Pro Leu His Trp Ala Ala AsnHis 180 185 190 Lys Asp Pro Ser Ala Val His Thr Val Arg Cys Ile Leu AspAla Ala 195 200 205 Pro Thr Glu Ser Leu Leu Asn Trp Gln Asp Tyr Glu GlyArg Thr Pro 210 215 220 Leu His Phe Ala Val Ala Asp Gly Asn Val Thr ValVal Asp Val Leu 225 230 235 240 Thr Ser Tyr Glu Ser Cys Asn Ile Thr SerTyr Asp Asn Leu Phe Arg 245 250 255 Thr Pro Leu His Trp Ala Ala Leu LeuGly His Ala Gln Ile Val His 260 265 270 Leu Leu Leu Glu Arg Asn Lys SerGly Thr Ile Pro Ser Asp Ser Gln 275 280 285 Gly Ala Thr Pro Leu His TyrAla Ala Gln Ser Asn Phe Ala Glu Thr 290 295 300 Val Lys Val Phe Leu LysHis Pro Ser Val Lys Asp Asp Ser Asp Leu 305 310 315 320 Glu Gly Arg ThrSer Phe Met Trp Ala Ala Gly Lys Gly Ser Asp Asp 325 330 335 Val Leu ArgThr Met Leu Ser Leu Lys Ser Asp Ile Asp Ile Asn Met 340 345 350 Ala AspLys Tyr Gly Gly Thr Ala Leu His Ala Ala Ala Leu Ser Gly 355 360 365 HisVal Ser Thr Val Lys Leu Leu Leu Glu Asn Asn Ala Gln Val Asp 370 375 380Ala Thr Asp Val Met Lys His Thr Pro Leu Phe Arg Ala Cys Glu Met 385 390395 400 Gly His Lys Asp Val Ile Gln Thr Leu Ile Lys Gly Gly Ala Arg Val405 410 415 Asp Leu Val Asp Gln Asp Gly His Ser Leu Leu His Trp Ala AlaLeu 420 425 430 Gly Gly Asn Ala Asp Val Cys Gln Ile Leu Ile Glu Asn LysIle Asn 435 440 445 Pro Asn Val Gln Asp Tyr Ala Gly Arg Thr Pro Leu GlnCys Ala Ala 450 455 460 Tyr Gly Gly Tyr Ile Asn Cys Met Ala Val Leu MetGlu Asn Asn Ala 465 470 475 480 Asp Pro Asn Ile Gln Asp Lys Glu Gly ArgThr Ala Leu His Trp Ser 485 490 495 Cys Asn Asn Gly Tyr Leu Asp Ala IleLys Leu Leu Leu Asp Phe Ala 500 505 510 Ala Phe Pro Asn Gln Met Glu AsnAsn Glu Glu Arg Tyr Thr Pro Leu 515 520 525 Asp Tyr Ala Leu Leu Gly GluArg His Glu Val Ile Gln Phe Met Leu 530 535 540 Glu His Gly Ala Leu SerIle Ala Ala Ile Gln Asp Ile Ala Ala Phe 545 550 555 560 Lys 
Ile Gln AlaVal Tyr Lys Gly Tyr Lys Val Arg Lys Ala Phe Arg 565 570 575 Asp Arg LysAsn Leu Leu Met Lys His Glu Gln Leu Arg Lys Asp Ala 580 585 590 Ala AlaLys Lys Arg Glu Glu Glu Asn Lys Arg Lys Glu Ala Glu Gln 595 600 605 GlnLys Gly Arg Arg Ser Pro Asp Ser Cys Arg Pro Gln Ala Leu Pro 610 615 620Cys Leu Pro Ser Thr Gln Asp Val Pro Ser Arg Gln Ser Arg Ala Pro 625 630635 640 Ser Lys Gln Pro Pro Ala Gly Asn Val Ala Gln Gly Pro Glu Pro Arg645 650 655 Asp Ser Arg Gly Ser Pro Gly Gly Ser Leu Gly Gly Ala Leu GlnLys 660 665 670 Glu Gln His Val Ser Ser Asp Leu Gln Gly Thr Asn Ser ArgArg Pro 675 680 685 Asn Glu Thr Ala Arg Glu His Ser Lys Gly Gln Ser AlaCys Val His 690 695 700 Phe Arg Pro Asn Glu Gly Ser Asp Gly Ser Arg HisPro Gly Val Pro 705 710 715 720 Ser Val Glu Lys Ser Arg Gly Glu Thr AlaGly Asp Glu Arg Cys Ala 725 730 735 Lys Gly Lys Gly Phe Val Lys Gln ProSer Cys Ile Arg Val Ala Gly 740 745 750 Pro Asp Glu Lys Gly Glu Asp SerArg Arg Ala Gly Ala Ser Leu Pro 755 760 765 Pro His Asp Ser His Trp LysPro Ser Arg Arg His Asp Thr Glu Pro 770 775 780 Lys Ala Lys Cys Ala ProGln Lys Arg Arg Thr Gln Glu Leu Arg Gly 785 790 795 800 Gly Arg Cys SerPro Ala Gly Ser Ser Arg Pro Gly Ser Ala Arg Gly 805 810 815 Glu Ala ValHis Ala Gly Gln Asn Pro Pro His His Arg Thr Pro Arg 820 825 830 Asn LysVal Thr Gln Ala Lys Leu Thr Gly Gly Leu Tyr Ser His Leu 835 840 845 ProGln Ser Thr Glu Glu Leu Arg Ser Gly Ala Arg Arg Leu Glu Thr 850 855 860Ser Thr Leu Ser Glu Asp Phe Gln Val Ser Lys Glu Thr Asp Pro Ala 865 870875 880 Pro Gly Pro Leu Ser Gly Gln Ser Val Asn Ile Asp Leu Leu Pro Val885 890 895 Glu Leu Arg Leu Gln Ile Ile Gln Arg Glu 900 905 37 3558 DNAHomo sapiens 37 ggttgctccc ggttgctaag aagactatga acaagtcaga gaacctgctgtttgctggtt 60 catcattagc atcacaagtc catgctgctg ccgttaatgg agataagggtgctctacaga 120 ggctcatcgt aggaaactct gctcttaaag acaaagaaga tcagtttgggagaacaccac 180 ttatgtattg cgtgttggct gacagattgg attgtgcaga tgctcttctgaaggcaggag 240 cagatgtgaa taaaactgac catagccaga gaacagccct ccatcttgcagcccagaagg 
300 gaaattatcg tttcatgaaa ctcttactta cacgcagagc aaactggatgcaaaaggatc 360 tggaagagat gactcctttg cacttgacca cccggcacag gagccctaagtgtttggcac 420 ttctgctgaa gtttatggca ccaggagaag tggatacaca ggataaaaacaagcaaacag 480 ctctgcattg gagtgcctac tacaataacc ctgagcatgt gaagctgctcatcaagcatg 540 attctaacat tgggattcct gatgttgaag gcaagatccc acttcactgggcagccaacc 600 ataaagatcc aagtgctgtt cacacagtga gatgcattct ggatgctgctccaacagagt 660 ctttactgaa ctggcaagac tacgagggtc gaactcctct tcactttgcagttgctgatg 720 ggaatgtgac cgtggttgat gtcttgacct catatgaaag ctgcaatataacgtcttatg 780 ataacttatt tcgaacccca ctgcactggg cagctttatt aggccatgcacagattgtcc 840 atctcctttt agaaagaaat aagtctggaa ctatcccatc tgacagccaaggagccacac 900 ctttgcacta tgctgctcag agtaactttg ctgaaacggt taaagtgtttttaaaacatc 960 cttcagtgaa agatgattca gacctggaag gaagaacatc ctttatgtgggcagctggca 1020 aaggcagtga tgatgtcctt agaactatgc tgagcttaaa atcggacatagatattaaca 1080 tggctgacaa atatggaggt acagctttgc atgctgctgc tctttctggccatgtcagca 1140 ccgtgaagtt attactggaa aataatgctc aagtagatgc tactgatgttatgaaacata 1200 ctccactttt ccgagcctgt gagatgggac acaaagatgt gattcagacactcattaaag 1260 gtggagcaag ggtagatcta gttgaccaag atggacattc tcttctacattgggcagcac 1320 tgggaggaaa tgctgatgtt tgccagatat taatagaaaa taagatcaatccaaatgtcc 1380 aggattatgc aggaagaacc cctttgcagt gtgcagcata tggaggctatatcaactgca 1440 tggcagttct catggaaaac aatgcagacc ctaacattca agacaaagagggaagaacag 1500 ctttgcattg gtcctgcaac aatggatacc ttgatgccat taaattactgctagactttg 1560 ctgctttccc taatcagatg gaaaacaatg aagagagata cacaccccttgattatgctt 1620 tgcttggtga gcgccatgaa gtgatccagt tcatgttgga gcacggtgccctgtccatcg 1680 cagccataca agacatcgcc gccttcaaaa tccaagctgt ctacaaagggtacaaggtca 1740 gaaaagcctt ccgagacagg aaaaatctcc tcatgaagca tgaacagttgagaaaagatg 1800 ctgctgccaa aaagcgagag gaagaaaaca aacgaaaaga ggcagaacagcaaaaaggaa 1860 ggcggagccc agattcctgc agaccccagg cccttccctg tctgcctagcacccaggatg 1920 tgcccagcag gcagagccgg gcccccagca agcagcctcc tgctggcaacgtggcccaag 1980 gccctgagcc aagagacagc agaggatctc caggagggtc 
tctaggcggagccctccaga 2040 aggagcagca tgtttcctca gatttgcagg gaacaaactc cagaaggccaaatgaaacag 2100 ccagagaaca ttctaaaggc caatctgctt gtgtccactt cagacccaatgaaggcagtg 2160 atggaagcag gcatccagga gttccctctg ttgagaagtc cagaggtgagacagctggcg 2220 atgagcggtg tgcaaagggg aaaggtttcg tgaagcagcc ctcctgtatcagggtggctg 2280 ggcctgatga gaaaggagag gactccaggc gggcaggtgc aagccttccaccgcacgata 2340 gccactggaa gcccagcagg cggcatgaca cagaacccaa ggccaaatgtgccccccaga 2400 aaaggcgcac tcaagagctc agaggaggaa ggtgctctcc ggctggttctagccgccctg 2460 gcagtgcccg gggggaggcg gtccatgctg ggcagaatcc tccccaccatcgtacaccaa 2520 gaaacaaagt gacacaagcc aagctcacag gagggctcta ttcacatttgccacagagca 2580 cagaggagtt gaggtcagga gctaggaggc tggagacatc taccctgtccgaggactttc 2640 aggtatctaa ggagactgat ccagcacctg gtcccctctc tgggcagagtgtgaatattg 2700 accttctccc cgtagagctc cgactgcaga taattcagag agaatgaaggaggaaggagc 2760 tgtttcgcaa aaagaacaag gcagcagcag tcatccagcg cgcctggcgaagctaccagc 2820 tcaggaagca cctgtcccac cttcggcata tgaagcagct tggagctggagatgtggaca 2880 gatggaggca agagtctaca gcattgctcc tccaggtttg gaggaaggaactggaactaa 2940 aattccccca aaccactgca gtaagcaagg cccccaagag tccatccaagggcacctcag 3000 gcacaaagtc caccaagcac tcagtgctta agcaaatcta tggttgttctcacgaaggga 3060 aaatacatca tcctacaaga tctgtaaaag cctcttctgt gctgcgtctcaactcagtga 3120 gcaacctaca gtgtatacat ctccttgaga acagtggaag atcaaagaacttttcttata 3180 acctgcaatc agctactcag ccaaaaaaca aaacaaaacc ttgactgcctatggaggaag 3240 actgtgttcg ggggagctgg catagctagt gcagagttca gattttctgctgataatctt 3300 ttacaccttg ggaaaacttt aatatccgta cctgaaggct gattcacctaaaaatgtgtt 3360 aactgaaaga aaatgtcaga atgtttcctt tctgctctta cacagcattgttttgtcaat 3420 caacacagcc tgcactgaaa ggacctgcat agactatgtc tgtgcaaagtgcctgagtgt 3480 ctgctttcac ctcagtctgt acagttggaa atgagaattc ataattaacagcaaaatcta 3540 aggaaaacta aaataaaa 3558 38 906 PRT Homo sapiens 38 MetAsn Lys Ser Glu Asn Leu Leu Phe Ala Gly Ser Ser Leu Ala Ser 1 5 10 15Gln Val His Ala Ala Ala Val Asn Gly Asp Lys Gly Ala Leu Gln Arg 20 25 30Leu Ile Val Gly Asn Ser Ala 
Leu Lys Asp Lys Glu Asp Gln Phe Gly 35 40 45Arg Thr Pro Leu Met Tyr Cys Val Leu Ala Asp Arg Leu Asp Cys Ala 50 55 60Asp Ala Leu Leu Lys Ala Gly Ala Asp Val Asn Lys Thr Asp His Ser 65 70 7580 Gln Arg Thr Ala Leu His Leu Ala Ala Gln Lys Gly Asn Tyr Arg Phe 85 9095 Met Lys Leu Leu Leu Thr Arg Arg Ala Asn Trp Met Gln Lys Asp Leu 100105 110 Glu Glu Met Thr Pro Leu His Leu Thr Thr Arg His Arg Ser Pro Lys115 120 125 Cys Leu Ala Leu Leu Leu Lys Phe Met Ala Pro Gly Glu Val AspThr 130 135 140 Gln Asp Lys Asn Lys Gln Thr Ala Leu His Trp Ser Ala TyrTyr Asn 145 150 155 160 Asn Pro Glu His Val Lys Leu Leu Ile Lys His AspSer Asn Ile Gly 165 170 175 Ile Pro Asp Val Glu Gly Lys Ile Pro Leu HisTrp Ala Ala Asn His 180 185 190 Lys Asp Pro Ser Ala Val His Thr Val ArgCys Ile Leu Asp Ala Ala 195 200 205 Pro Thr Glu Ser Leu Leu Asn Trp GlnAsp Tyr Glu Gly Arg Thr Pro 210 215 220 Leu His Phe Ala Val Ala Asp GlyAsn Val Thr Val Val Asp Val Leu 225 230 235 240 Thr Ser Tyr Glu Ser CysAsn Ile Thr Ser Tyr Asp Asn Leu Phe Arg 245 250 255 Thr Pro Leu His TrpAla Ala Leu Leu Gly His Ala Gln Ile Val His 260 265 270 Leu Leu Leu GluArg Asn Lys Ser Gly Thr Ile Pro Ser Asp Ser Gln 275 280 285 Gly Ala ThrPro Leu His Tyr Ala Ala Gln Ser Asn Phe Ala Glu Thr 290 295 300 Val LysVal Phe Leu Lys His Pro Ser Val Lys Asp Asp Ser Asp Leu 305 310 315 320Glu Gly Arg Thr Ser Phe Met Trp Ala Ala Gly Lys Gly Ser Asp Asp 325 330335 Val Leu Arg Thr Met Leu Ser Leu Lys Ser Asp Ile Asp Ile Asn Met 340345 350 Ala Asp Lys Tyr Gly Gly Thr Ala Leu His Ala Ala Ala Leu Ser Gly355 360 365 His Val Ser Thr Val Lys Leu Leu Leu Glu Asn Asn Ala Gln ValAsp 370 375 380 Ala Thr Asp Val Met Lys His Thr Pro Leu Phe Arg Ala CysGlu Met 385 390 395 400 Gly His Lys Asp Val Ile Gln Thr Leu Ile Lys GlyGly Ala Arg Val 405 410 415 Asp Leu Val Asp Gln Asp Gly His Ser Leu LeuHis Trp Ala Ala Leu 420 425 430 Gly Gly Asn Ala Asp Val Cys Gln Ile LeuIle Glu Asn Lys Ile Asn 435 440 445 Pro Asn Val Gln Asp Tyr Ala Gly ArgThr Pro Leu Gln Cys Ala Ala 450 455 
460 Tyr Gly Gly Tyr Ile Asn Cys MetAla Val Leu Met Glu Asn Asn Ala 465 470 475 480 Asp Pro Asn Ile Gln AspLys Glu Gly Arg Thr Ala Leu His Trp Ser 485 490 495 Cys Asn Asn Gly TyrLeu Asp Ala Ile Lys Leu Leu Leu Asp Phe Ala 500 505 510 Ala Phe Pro AsnGln Met Glu Asn Asn Glu Glu Arg Tyr Thr Pro Leu 515 520 525 Asp Tyr AlaLeu Leu Gly Glu Arg His Glu Val Ile Gln Phe Met Leu 530 535 540 Glu HisGly Ala Leu Ser Ile Ala Ala Ile Gln Asp Ile Ala Ala Phe 545 550 555 560Lys Ile Gln Ala Val Tyr Lys Gly Tyr Lys Val Arg Lys Ala Phe Arg 565 570575 Asp Arg Lys Asn Leu Leu Met Lys His Glu Gln Leu Arg Lys Asp Ala 580585 590 Ala Ala Lys Lys Arg Glu Glu Glu Asn Lys Arg Lys Glu Ala Glu Gln595 600 605 Gln Lys Gly Arg Arg Ser Pro Asp Ser Cys Arg Pro Gln Ala LeuPro 610 615 620 Cys Leu Pro Ser Thr Gln Asp Val Pro Ser Arg Gln Ser ArgAla Pro 625 630 635 640 Ser Lys Gln Pro Pro Ala Gly Asn Val Ala Gln GlyPro Glu Pro Arg 645 650 655 Asp Ser Arg Gly Ser Pro Gly Gly Ser Leu GlyGly Ala Leu Gln Lys 660 665 670 Glu Gln His Val Ser Ser Asp Leu Gln GlyThr Asn Ser Arg Arg Pro 675 680 685 Asn Glu Thr Ala Arg Glu His Ser LysGly Gln Ser Ala Cys Val His 690 695 700 Phe Arg Pro Asn Glu Gly Ser AspGly Ser Arg His Pro Gly Val Pro 705 710 715 720 Ser Val Glu Lys Ser ArgGly Glu Thr Ala Gly Asp Glu Arg Cys Ala 725 730 735 Lys Gly Lys Gly PheVal Lys Gln Pro Ser Cys Ile Arg Val Ala Gly 740 745 750 Pro Asp Glu LysGly Glu Asp Ser Arg Arg Ala Gly Ala Ser Leu Pro 755 760 765 Pro His AspSer His Trp Lys Pro Ser Arg Arg His Asp Thr Glu Pro 770 775 780 Lys AlaLys Cys Ala Pro Gln Lys Arg Arg Thr Gln Glu Leu Arg Gly 785 790 795 800Gly Arg Cys Ser Pro Ala Gly Ser Ser Arg Pro Gly Ser Ala Arg Gly 805 810815 Glu Ala Val His Ala Gly Gln Asn Pro Pro His His Arg Thr Pro Arg 820825 830 Asn Lys Val Thr Gln Ala Lys Leu Thr Gly Gly Leu Tyr Ser His Leu835 840 845 Pro Gln Ser Thr Glu Glu Leu Arg Ser Gly Ala Arg Arg Leu GluThr 850 855 860 Ser Thr Leu Ser Glu Asp Phe Gln Val Ser Lys Glu Thr AspPro Ala 865 870 875 880 Pro Gly Pro Leu Ser Gly 
Gln Ser Val Asn Ile AspLeu Leu Pro Val 885 890 895 Glu Leu Arg Leu Gln Ile Ile Gln Arg Glu 900905 39 3559 DNA Homo sapiens 39 ggttgctccc ggttgctaag aagactatgaacaagtcaga gaacctgctg tttgctggtt 60 catcattagc atcacaagtc catgctgctgccgttaatgg agataagggt gctctacaga 120 ggctcatcgt aggaaactct gctcttaaagacaaagaaga tcagtttggg agaacaccac 180 ttatgtattg cgtgttggct gacagattggattgtgcaga tgctcttctg aaggcaggag 240 cagatgtgaa taaaactgac catagccagagaacagccct ccatcttgca gcccagaagg 300 gaaattatcg tttcatgaaa ctcttacttacacgcagagc aaactggatg caaaaggatc 360 tggaagagat gactcctttg cacttgaccacccggcacag gagccctaag tgtttggcac 420 ttctgctgaa gtttatggca ccaggagaagtggatacaca ggataaaaac aagcaaacag 480 ctctgcattg gagtgcctac tacaataaccctgagcatgt gaagctgctc atcaagcatg 540 attctaacat tgggattcct gatgttgaaggcaagatccc acttcactgg gcagccaacc 600 ataaagatcc aagtgctgtt cacacagtgagatgcattct ggatgctgct ccaacagagt 660 ctttactgaa ctggcaagac tacgagggtcgaactcctct tcactttgca gttgctgatg 720 ggaatgtgac cgtggttgat gtcttgacctcatatgaaag ctgcaatata acgtcttatg 780 ataacttatt tcgaacccca ctgcactgggcagctttatt aggccatgca cagattgtcc 840 atctcctttt agaaagaaat aagtctggaactatcccatc tgacagccaa ggagccacac 900 ctttgcacta tgctgctcag agtaactttgctgaaacggt taaagtgttt ttaaaacatc 960 cttcagtgaa agatgattca gacctggaaggaagaacatc ctttatgtgg gcagctggca 1020 aaggcagtga tgatgtcctt agaactatgctgagcttaaa atcggacata gatattaaca 1080 tggctgacaa atatggaggt acagctttgcatgctgctgc tctttctggc catgtcagca 1140 ccgtgaagtt attactggaa aataatgctcaagtagatgc tactgatgtt atgaaacata 1200 ctccactttt ccgagcctgt gagatgggacacaaagatgt gattcagaca ctcattaaag 1260 gtggagcaag ggtagatcta gttgaccaagatggacattc tcttctacat tgggcagcac 1320 tgggaggaaa tgctgatgtt tgccagatattaatagaaaa taagatcaat ccaaatgtcc 1380 aggattatgc aggaagaacc cctttgcagtgtgcagcata tggaggctat atcaactgca 1440 tggcagttct catggaaaac aatgcagaccctaacattca agacaaagag ggaagaacag 1500 ctttgcattg gtcctgcaac aatggataccttgatgccat taaattactg ctagactttg 1560 ctgctttccc taatcagatg gaaaacaatgaagagagata cacacccctt gattatgctt 1620 
tgcttggtga gcgccatgaa gtgatccagttcatgttgga gcacggtgcc ctgtccatcg 1680 cagccataca agacatcgcc gccttcaaaatccaagctgt ctacaaaggg tacaaggtca 1740 gaaaagcctt ccgagacagg aaaaatctcctcatgaagca tgaacagttg agaaaagatg 1800 ctgctgccaa aaagcgagag gaagaaaacaaacgaaaaga ggcagaacag caaaaaggaa 1860 ggcggagccc agattcctgc agaccccaggcccttccctg tctgcctagc acccaggatg 1920 tgcccagcag gcagagccgg gcccccagcaagcagcctcc tgctggcaac gtggcccaag 1980 gccctgagcc aagagacagc agaggatctccaggagggtc tctaggcgga gccctccaga 2040 aggagcagca tgtttcctca gatttgcagggaacaaactc cagaaggcca aatgaaacag 2100 ccagagaaca ttctaaaggc caatctgcttgtgtccactt cagacccaat gaaggcagtg 2160 atggaagcag gcatccagga gttccctctgttgagaagtc cagaggtgag acagctggcg 2220 atgagcggtg tgcaaagggg aaaggtttcgtgaagcagcc ctcctgtatc agggtggctg 2280 ggcctgatga gaaaggagag gactccaggcgggcaggtgc aagccttcca ccgcacgata 2340 gccactggaa gcccagcagg cggcatgacacagaacccaa ggccaaatgt gccccccaga 2400 aaaggcgcac tcaagagctc agaggaggaaggtgctctcc ggctggttct agccgccctg 2460 gcagtgcccg gggggaggcg gtccatgctgggcagaatcc tccccaccat cgtacaccaa 2520 gaaacaaagt gacacaagcc aagctcacaggagggctcta ttcacatttg ccacagagca 2580 cagaggagtt gaggtcagga gctaggaggctggagacatc taccctgtcc gaggactttc 2640 aggtatctaa ggagactgat ccagcacctggtcccctctc tgggcagagt gtgaatattg 2700 accttctccc cgtagagctc cgactgcagataattcagag agaacgaagg aggaaggagc 2760 tgtttcgcaa aaaagaacaa ggcagcagcagtcatccagc gcgcctggcg aagctaccag 2820 ctcaggaagc acctgtccca ccttcggcatatgaagcagc ttggagctgg agatgtggac 2880 agatggaggc aagagtctac agcattgctcctccaggttt ggaggaagga actggaacta 2940 aaattccccc aaaccactgc agtaagcaaggcccccaaga gtccatccaa gggcacctca 3000 ggcacaaagt ccaccaagca ctcagtgcttaagcaaatct atggttgttc tcacgaaggg 3060 aaaatacatc atcctacaag atctgtaaaagcctcttctg tgctgcgtct caactcagtg 3120 agcaacctac agtgtataca tctccttgagaacagtggaa gatcaaagaa cttttcttat 3180 aacctgcaat cagctactca gccaaaaaacaaaacaaaac cttgactgcc tatggaggaa 3240 gactgtgttc gggggagctg gcatagctagtgcagagttc agattttctg ctgataatct 3300 tttacacctt gggaaaactt 
taatatccgtacctgaaggc tgattcacct aaaaatgtgt 3360 taactgaaag aaaatgtcag aatgtttcctttctgctctt acacagcatt gttttgtcaa 3420 tcaacacagc ctgcactgaa aggacctgcatagactatgt ctgtgcaaag tgcctgagtg 3480 tctgctttca cctcagtctg tacagttggaaatgagaatt cataattaac agcaaaatct 3540 aaggaaaact aaaataaaa 3559 40 1001PRT Homo sapiens 40 Met Asn Lys Ser Glu Asn Leu Leu Phe Ala Gly Ser SerLeu Ala Ser 1 5 10 15 Gln Val His Ala Ala Ala Val Asn Gly Asp Lys GlyAla Leu Gln Arg 20 25 30 Leu Ile Val Gly Asn Ser Ala Leu Lys Asp Lys GluAsp Gln Phe Gly 35 40 45 Arg Thr Pro Leu Met Tyr Cys Val Leu Ala Asp ArgLeu Asp Cys Ala 50 55 60 Asp Ala Leu Leu Lys Ala Gly Ala Asp Val Asn LysThr Asp His Ser 65 70 75 80 Gln Arg Thr Ala Leu His Leu Ala Ala Gln LysGly Asn Tyr Arg Phe 85 90 95 Met Lys Leu Leu Leu Thr Arg Arg Ala Asn TrpMet Gln Lys Asp Leu 100 105 110 Glu Glu Met Thr Pro Leu His Leu Thr ThrArg His Arg Ser Pro Lys 115 120 125 Cys Leu Ala Leu Leu Leu Lys Phe MetAla Pro Gly Glu Val Asp Thr 130 135 140 Gln Asp Lys Asn Lys Gln Thr AlaLeu His Trp Ser Ala Tyr Tyr Asn 145 150 155 160 Asn Pro Glu His Val LysLeu Leu Ile Lys His Asp Ser Asn Ile Gly 165 170 175 Ile Pro Asp Val GluGly Lys Ile Pro Leu His Trp Ala Ala Asn His 180 185 190 Lys Asp Pro SerAla Val His Thr Val Arg Cys Ile Leu Asp Ala Ala 195 200 205 Pro Thr GluSer Leu Leu Asn Trp Gln Asp Tyr Glu Gly Arg Thr Pro 210 215 220 Leu HisPhe Ala Val Ala Asp Gly Asn Val Thr Val Val Asp Val Leu 225 230 235 240Thr Ser Tyr Glu Ser Cys Asn Ile Thr Ser Tyr Asp Asn Leu Phe Arg 245 250255 Thr Pro Leu His Trp Ala Ala Leu Leu Gly His Ala Gln Ile Val His 260265 270 Leu Leu Leu Glu Arg Asn Lys Ser Gly Thr Ile Pro Ser Asp Ser Gln275 280 285 Gly Ala Thr Pro Leu His Tyr Ala Ala Gln Ser Asn Phe Ala GluThr 290 295 300 Val Lys Val Phe Leu Lys His Pro Ser Val Lys Asp Asp SerAsp Leu 305 310 315 320 Glu Gly Arg Thr Ser Phe Met Trp Ala Ala Gly LysGly Ser Asp Asp 325 330 335 Val Leu Arg Thr Met Leu Ser Leu Lys Ser AspIle Asp Ile Asn Met 340 345 350 Ala Asp Lys Tyr Gly Gly Thr Ala Leu 
HisAla Ala Ala Leu Ser Gly 355 360 365 His Val Ser Thr Val Lys Leu Leu LeuGlu Asn Asn Ala Gln Val Asp 370 375 380 Ala Thr Asp Val Met Lys His ThrPro Leu Phe Arg Ala Cys Glu Met 385 390 395 400 Gly His Lys Asp Val IleGln Thr Leu Ile Lys Gly Gly Ala Arg Val 405 410 415 Asp Leu Val Asp GlnAsp Gly His Ser Leu Leu His Trp Ala Ala Leu 420 425 430 Gly Gly Asn AlaAsp Val Cys Gln Ile Leu Ile Glu Asn Lys Ile Asn 435 440 445 Pro Asn ValGln Asp Tyr Ala Gly Arg Thr Pro Leu Gln Cys Ala Ala 450 455 460 Tyr GlyGly Tyr Ile Asn Cys Met Ala Val Leu Met Glu Asn Asn Ala 465 470 475 480Asp Pro Asn Ile Gln Asp Lys Glu Gly Arg Thr Ala Leu His Trp Ser 485 490495 Cys Asn Asn Gly Tyr Leu Asp Ala Ile Lys Leu Leu Leu Asp Phe Ala 500505 510 Ala Phe Pro Asn Gln Met Glu Asn Asn Glu Glu Arg Tyr Thr Pro Leu515 520 525 Asp Tyr Ala Leu Leu Gly Glu Arg His Glu Val Ile Gln Phe MetLeu 530 535 540 Glu His Gly Ala Leu Ser Ile Ala Ala Ile Gln Asp Ile AlaAla Phe 545 550 555 560 Lys Ile Gln Ala Val Tyr Lys Gly Tyr Lys Val ArgLys Ala Phe Arg 565 570 575 Asp Arg Lys Asn Leu Leu Met Lys His Glu GlnLeu Arg Lys Asp Ala 580 585 590 Ala Ala Lys Lys Arg Glu Glu Glu Asn LysArg Lys Glu Ala Glu Gln 595 600 605 Gln Lys Gly Arg Arg Ser Pro Asp SerCys Arg Pro Gln Ala Leu Pro 610 615 620 Cys Leu Pro Ser Thr Gln Asp ValPro Ser Arg Gln Ser Arg Ala Pro 625 630 635 640 Ser Lys Gln Pro Pro AlaGly Asn Val Ala Gln Gly Pro Glu Pro Arg 645 650 655 Asp Ser Arg Gly SerPro Gly Gly Ser Leu Gly Gly Ala Leu Gln Lys 660 665 670 Glu Gln His ValSer Ser Asp Leu Gln Gly Thr Asn Ser Arg Arg Pro 675 680 685 Asn Glu ThrAla Arg Glu His Ser Lys Gly Gln Ser Ala Cys Val His 690 695 700 Phe ArgPro Asn Glu Gly Ser Asp Gly Ser Arg His Pro Gly Val Pro 705 710 715 720Ser Val Glu Lys Ser Arg Gly Glu Thr Ala Gly Asp Glu Arg Cys Ala 725 730735 Lys Gly Lys Gly Phe Val Lys Gln Pro Ser Cys Ile Arg Val Ala Gly 740745 750 Pro Asp Glu Lys Gly Glu Asp Ser Arg Arg Ala Gly Ala Ser Leu Pro755 760 765 Pro His Asp Ser His Trp Lys Pro Ser Arg Arg His Asp Thr GluPro 770 
775 780 Lys Ala Lys Cys Ala Pro Gln Lys Arg Arg Thr Gln Glu LeuArg Gly 785 790 795 800 Gly Arg Cys Ser Pro Ala Gly Ser Ser Arg Pro GlySer Ala Arg Gly 805 810 815 Glu Ala Val His Ala Gly Gln Asn Pro Pro HisHis Arg Thr Pro Arg 820 825 830 Asn Lys Val Thr Gln Ala Lys Leu Thr GlyGly Leu Tyr Ser His Leu 835 840 845 Pro Gln Ser Thr Glu Glu Leu Arg SerGly Ala Arg Arg Leu Glu Thr 850 855 860 Ser Thr Leu Ser Glu Asp Phe GlnVal Ser Lys Glu Thr Asp Pro Ala 865 870 875 880 Pro Gly Pro Leu Ser GlyGln Ser Val Asn Ile Asp Leu Leu Pro Val 885 890 895 Glu Leu Arg Leu GlnIle Ile Gln Arg Glu Arg Arg Arg Lys Glu Leu 900 905 910 Phe Arg Lys LysGlu Gln Gly Ser Ser Ser His Pro Ala Arg Leu Ala 915 920 925 Lys Leu ProAla Gln Glu Ala Pro Val Pro Pro Ser Ala Tyr Glu Ala 930 935 940 Ala TrpSer Trp Arg Cys Gly Gln Met Glu Ala Arg Val Tyr Ser Ile 945 950 955 960Ala Pro Pro Gly Leu Glu Glu Gly Thr Gly Thr Lys Ile Pro Pro Asn 965 970975 His Cys Ser Lys Gln Gly Pro Gln Glu Ser Ile Gln Gly His Leu Arg 980985 990 His Lys Val His Gln Ala Leu Ser Ala 995 1000 41 3558 DNA Homosapiens 41 ggttgctccc ggttgctaag aagactatga acaagtcaga gaacctgctgtttgctggtt 60 catcattagc atcacaagtc catgctgctg ccgttaatgg agataagggtgctctacaga 120 ggctcatcgt aggaaactct gctcttaaag acaaagaaga tcagtttgggagaacaccac 180 ttatgtattg cgtgttggct gacagattgg attgtgcaga tgctcttctgaaggcaggag 240 cagatgtgaa taaaactgac catagccaga gaacagccct ccatcttgcagcccagaagg 300 gaaattatcg tttcatgaaa ctcttactta cacgcagagc aaactggatgcaaaaggatc 360 tggaagagat gactcctttg cacttgacca cccggcacag gagccctaagtgtttggcac 420 ttctgctgaa gtttatggca ccaggagaag tggatacaca ggataaaaacaagcaaacag 480 ctctgcattg gagtgcctac tacaataacc ctgagcatgt gaagctgctcatcaagcatg 540 attctaacat tgggattcct gatgttgaag gcaagatccc acttcactgggcagccaacc 600 ataaagatcc aagtgctgtt cacacagtga gatgcattct ggatgctgctccaacagagt 660 ctttactgaa ctggcaagac tacgagggtc gaactcctct tcactttgcagttgctgatg 720 ggaatgtgac cgtggttgat gtcttgacct catatgaaag ctgcaatataacgtcttatg 780 ataacttatt tcgaacccca ctgcactggg 
cagctttatt aggccatgcacagattgtcc 840 atctcctttt agaaagaaat aagtctggaa ctatcccatc tgacagccaaggagccacac 900 ctttgcacta tgctgctcag agtaactttg ctgaaacggt taaagtgtttttaaaacatc 960 cttcagtgaa agatgattca gacctggaag gaagaacatc ctttatgtgggcagctggca 1020 aaggcagtga tgatgtcctt agaactatgc tgagcttaaa atcggacatagatattaaca 1080 tggctgacaa atatggaggt acagctttgc atgctgctgc tctttctggccatgtcagca 1140 ccgtgaagtt attactggaa aataatgctc aagtagatgc tactgatgttatgaaacata 1200 ctccactttt ccgagcctgt gagatgggac acaaagatgt gattcagacactcattaaag 1260 gtggagcaag ggtagatcta gttgaccaag atggacattc tcttctacattgggcagcac 1320 tgggaggaaa tgctgatgtt tgccagatat taatagaaaa taagatcaatccaaatgtcc 1380 aggattatgc aggaagaacc cctttgcagt gtgcagcata tggaggctatatcaactgca 1440 tggcagttct catggaaaac aatgcagacc ctaacattca agacaaagagggaagaacag 1500 cttcgcattg gtcctgcaac aatggatacc ttgatgccat taaattactgctagactttg 1560 ctgctttccc taatcagatg gaaaacaatg aagagagata cacaccccttgattatgctt 1620 tgcttggtga gcgccatgaa gtgatccagt tcatgttgga gcacggtgccctgtccatcg 1680 cagccataca agacatcgcc gccttcaaaa tccaagctgt ctacaaagggtacaaggtca 1740 gaaaagcctt ccgagacagg aaaaatctcc tcatgaagca tgaacagttgagaaaagatg 1800 ctgctgccaa aaagcgagag gaagaaaaca aacgaaaaga ggcagaacagcaaaaaggaa 1860 ggcggagccc agattcctgc agaccccagg cccttccctg tctgcctagcacccaggatg 1920 tgcccagcag gcagagccgg gcccccagca agcagcctcc tgctggcaacgtggcccaag 1980 gccctgagcc aagagacagc agaggatctc caggagggtc tctaggcggagccctccaga 2040 aggagcagca tgtttcctca gatttgcagg gaacaaactc cagaaggccaaatgaaacag 2100 ccagagaaca ttctaaaggc caatctgctt gtgtccactt cagacccaatgaaggcagtg 2160 atggaagcag gcatccagga gttccctctg ttgagaagtc cagaggtgagacagctggcg 2220 atgagcggtg tgcaaagggg aaaggtttcg tgaagcagcc ctcctgtatcagggtggctg 2280 ggcctgatga gaaaggagag gactccaggc gggcaggtgc aagccttccaccgcacgata 2340 gccactggaa gcccagcagg cggcatgaca cagaacccaa ggccaaatgtgccccccaga 2400 aaaggcgcac tcaagagctc agaggaggaa ggtgctctcc ggctggttctagccgccctg 2460 gcagtgcccg gggggaggcg gtccatgctg ggcagaatcc tccccaccatcgtacaccaa 2520 
gaaacaaagt gacacaagcc aagctcacag gagggctcta ttcacatttgccacagagca 2580 cagaggagtt gaggtcagga gctaggaggc tggagacatc taccctgtccgaggactttc 2640 aggtatctaa ggagactgat ccagcacctg gtcccctctc tgggcagagtgtgaatattg 2700 accttctccc cgtagagctc cgactgcaga taattcagag agaacgaaggaggaaggagc 2760 tgtttcgcaa aaagaacaag gcagcagcag tcatccagcg cgcctggcgaagctaccagc 2820 tcaggaagca cctgtcccac cttcggcata tgaagcagct tggagctggagatgtggaca 2880 gatggaggca agagtctaca gcattgctcc tccaggtttg gaggaaggaactggaactaa 2940 aattccccca aaccactgca gtaagcaagg cccccaagag tccatccaagggcacctcag 3000 gcacaaagtc caccaagcac tcagtgctta agcaaatcta tggttgttctcacgaaggga 3060 aaatacatca tcctacaaga tctgtaaaag cctcttctgt gctgcgtctcaactcagtga 3120 gcaacctaca gtgtatacat ctccttgaga acagtggaag atcaaagaacttttcttata 3180 acctgcaatc agctactcag ccaaaaaaca aaacaaaacc ttgactgcctatggaggaag 3240 actgtgttcg ggggagctgg catagctagt gcagagttca gattttctgctgataatctt 3300 ttacaccttg ggaaaacttt aatatccgta cctgaaggct gattcacctaaaaatgtgtt 3360 aactgaaaga aaatgtcaga atgtttcctt tctgctctta cacagcattgttttgtcaat 3420 caacacagcc tgcactgaaa ggacctgcat agactatgtc tgtgcaaagtgcctgagtgt 3480 ctgctttcac ctcagtctgt acagttggaa atgagaattc ataattaacagcaaaatcta 3540 aggaaaacta aaataaaa 3558 42 1065 PRT Homo sapiens 42 MetAsn Lys Ser Glu Asn Leu Leu Phe Ala Gly Ser Ser Leu Ala Ser 1 5 10 15Gln Val His Ala Ala Ala Val Asn Gly Asp Lys Gly Ala Leu Gln Arg 20 25 30Leu Ile Val Gly Asn Ser Ala Leu Lys Asp Lys Glu Asp Gln Phe Gly 35 40 45Arg Thr Pro Leu Met Tyr Cys Val Leu Ala Asp Arg Leu Asp Cys Ala 50 55 60Asp Ala Leu Leu Lys Ala Gly Ala Asp Val Asn Lys Thr Asp His Ser 65 70 7580 Gln Arg Thr Ala Leu His Leu Ala Ala Gln Lys Gly Asn Tyr Arg Phe 85 9095 Met Lys Leu Leu Leu Thr Arg Arg Ala Asn Trp Met Gln Lys Asp Leu 100105 110 Glu Glu Met Thr Pro Leu His Leu Thr Thr Arg His Arg Ser Pro Lys115 120 125 Cys Leu Ala Leu Leu Leu Lys Phe Met Ala Pro Gly Glu Val AspThr 130 135 140 Gln Asp Lys Asn Lys Gln Thr Ala Leu His Trp Ser Ala TyrTyr Asn 145 150 155 160 Asn Pro Glu His 
Val Lys Leu Leu Ile Lys His AspSer Asn Ile Gly 165 170 175 Ile Pro Asp Val Glu Gly Lys Ile Pro Leu HisTrp Ala Ala Asn His 180 185 190 Lys Asp Pro Ser Ala Val His Thr Val ArgCys Ile Leu Asp Ala Ala 195 200 205 Pro Thr Glu Ser Leu Leu Asn Trp GlnAsp Tyr Glu Gly Arg Thr Pro 210 215 220 Leu His Phe Ala Val Ala Asp GlyAsn Val Thr Val Val Asp Val Leu 225 230 235 240 Thr Ser Tyr Glu Ser CysAsn Ile Thr Ser Tyr Asp Asn Leu Phe Arg 245 250 255 Thr Pro Leu His TrpAla Ala Leu Leu Gly His Ala Gln Ile Val His 260 265 270 Leu Leu Leu GluArg Asn Lys Ser Gly Thr Ile Pro Ser Asp Ser Gln 275 280 285 Gly Ala ThrPro Leu His Tyr Ala Ala Gln Ser Asn Phe Ala Glu Thr 290 295 300 Val LysVal Phe Leu Lys His Pro Ser Val Lys Asp Asp Ser Asp Leu 305 310 315 320Glu Gly Arg Thr Ser Phe Met Trp Ala Ala Gly Lys Gly Ser Asp Asp 325 330335 Val Leu Arg Thr Met Leu Ser Leu Lys Ser Asp Ile Asp Ile Asn Met 340345 350 Ala Asp Lys Tyr Gly Gly Thr Ala Leu His Ala Ala Ala Leu Ser Gly355 360 365 His Val Ser Thr Val Lys Leu Leu Leu Glu Asn Asn Ala Gln ValAsp 370 375 380 Ala Thr Asp Val Met Lys His Thr Pro Leu Phe Arg Ala CysGlu Met 385 390 395 400 Gly His Lys Asp Val Ile Gln Thr Leu Ile Lys GlyGly Ala Arg Val 405 410 415 Asp Leu Val Asp Gln Asp Gly His Ser Leu LeuHis Trp Ala Ala Leu 420 425 430 Gly Gly Asn Ala Asp Val Cys Gln Ile LeuIle Glu Asn Lys Ile Asn 435 440 445 Pro Asn Val Gln Asp Tyr Ala Gly ArgThr Pro Leu Gln Cys Ala Ala 450 455 460 Tyr Gly Gly Tyr Ile Asn Cys MetAla Val Leu Met Glu Asn Asn Ala 465 470 475 480 Asp Pro Asn Ile Gln AspLys Glu Gly Arg Thr Ala Ser His Trp Ser 485 490 495 Cys Asn Asn Gly TyrLeu Asp Ala Ile Lys Leu Leu Leu Asp Phe Ala 500 505 510 Ala Phe Pro AsnGln Met Glu Asn Asn Glu Glu Arg Tyr Thr Pro Leu 515 520 525 Asp Tyr AlaLeu Leu Gly Glu Arg His Glu Val Ile Gln Phe Met Leu 530 535 540 Glu HisGly Ala Leu Ser Ile Ala Ala Ile Gln Asp Ile Ala Ala Phe 545 550 555 560Lys Ile Gln Ala Val Tyr Lys Gly Tyr Lys Val Arg Lys Ala Phe Arg 565 570575 Asp Arg Lys Asn Leu Leu Met Lys His Glu Gln Leu 
Arg Lys Asp Ala 580585 590 Ala Ala Lys Lys Arg Glu Glu Glu Asn Lys Arg Lys Glu Ala Glu Gln595 600 605 Gln Lys Gly Arg Arg Ser Pro Asp Ser Cys Arg Pro Gln Ala LeuPro 610 615 620 Cys Leu Pro Ser Thr Gln Asp Val Pro Ser Arg Gln Ser ArgAla Pro 625 630 635 640 Ser Lys Gln Pro Pro Ala Gly Asn Val Ala Gln GlyPro Glu Pro Arg 645 650 655 Asp Ser Arg Gly Ser Pro Gly Gly Ser Leu GlyGly Ala Leu Gln Lys 660 665 670 Glu Gln His Val Ser Ser Asp Leu Gln GlyThr Asn Ser Arg Arg Pro 675 680 685 Asn Glu Thr Ala Arg Glu His Ser LysGly Gln Ser Ala Cys Val His 690 695 700 Phe Arg Pro Asn Glu Gly Ser AspGly Ser Arg His Pro Gly Val Pro 705 710 715 720 Ser Val Glu Lys Ser ArgGly Glu Thr Ala Gly Asp Glu Arg Cys Ala 725 730 735 Lys Gly Lys Gly PheVal Lys Gln Pro Ser Cys Ile Arg Val Ala Gly 740 745 750 Pro Asp Glu LysGly Glu Asp Ser Arg Arg Ala Gly Ala Ser Leu Pro 755 760 765 Pro His AspSer His Trp Lys Pro Ser Arg Arg His Asp Thr Glu Pro 770 775 780 Lys AlaLys Cys Ala Pro Gln Lys Arg Arg Thr Gln Glu Leu Arg Gly 785 790 795 800Gly Arg Cys Ser Pro Ala Gly Ser Ser Arg Pro Gly Ser Ala Arg Gly 805 810815 Glu Ala Val His Ala Gly Gln Asn Pro Pro His His Arg Thr Pro Arg 820825 830 Asn Lys Val Thr Gln Ala Lys Leu Thr Gly Gly Leu Tyr Ser His Leu835 840 845 Pro Gln Ser Thr Glu Glu Leu Arg Ser Gly Ala Arg Arg Leu GluThr 850 855 860 Ser Thr Leu Ser Glu Asp Phe Gln Val Ser Lys Glu Thr AspPro Ala 865 870 875 880 Pro Gly Pro Leu Ser Gly Gln Ser Val Asn Ile AspLeu Leu Pro Val 885 890 895 Glu Leu Arg Leu Gln Ile Ile Gln Arg Glu ArgArg Arg Lys Glu Leu 900 905 910 Phe Arg Lys Lys Asn Lys Ala Ala Ala ValIle Gln Arg Ala Trp Arg 915 920 925 Ser Tyr Gln Leu Arg Lys His Leu SerHis Leu Arg His Met Lys Gln 930 935 940 Leu Gly Ala Gly Asp Val Asp ArgTrp Arg Gln Glu Ser Thr Ala Leu 945 950 955 960 Leu Leu Gln Val Trp ArgLys Glu Leu Glu Leu Lys Phe Pro Gln Thr 965 970 975 Thr Ala Val Ser LysAla Pro Lys Ser Pro Ser Lys Gly Thr Ser Gly 980 985 990 Thr Lys Ser ThrLys His Ser Val Leu Lys Gln Ile Tyr Gly Cys Ser 995 1000 1005 
His Glu Gly Lys Ile His His Pro Thr Arg Ser Val Lys Ala Ser 1010 1015 1020
Ser Val Leu Arg Leu Asn Ser Val Ser Asn Leu Gln Cys Ile His 1025 1030 1035
Leu Leu Glu Asn Ser Gly Arg Ser Lys Asn Phe Ser Tyr Asn Leu 1040 1045 1050
Gln Ser Ala Thr Gln Pro Lys Asn Lys Thr Lys Pro 1055 1060 1065
43 19 DNA Artificial Sequence Synthetic 43 gtcggacatg caaatcagg 19
44 20 DNA Artificial Sequence Synthetic 44 aagccttcag gattgctgtg 20
45 18 DNA Artificial Sequence Synthetic 45 acatggcctg ccagtgac 18
46 20 DNA Artificial Sequence Synthetic 46 acgtgtagga aggcggtctc 20
47 19 DNA Artificial Sequence Synthetic 47 gaggcctcca tgtgctttc 19
48 20 DNA Artificial Sequence Synthetic 48 tgaccctcat tgagaactgc 20
49 20 DNA Artificial Sequence Synthetic 49 ttgtgctctg tctgggagtc 20
50 18 DNA Artificial Sequence Synthetic 50 ctcccccagg gacttctg 18
51 20 DNA Artificial Sequence Synthetic 51 ttctgacagt ggtcgacgtg 20
52 20 DNA Artificial Sequence Synthetic 52 cactgttgat ttcccctctc 20
53 20 DNA Artificial Sequence Synthetic 53 ttcctggttg gatcgttctg 20
54 19 DNA Artificial Sequence Synthetic 54 aggcctgtgg agacctgac 19
55 19 DNA Artificial Sequence Synthetic 55 catgttggga gctttgtgg 19
56 19 DNA Artificial Sequence Synthetic 56 atctgagcac cgttggttg 19
57 18 DNA Artificial Sequence Synthetic 57 ggtttccaca gggaggtg 18
58 20 DNA Artificial Sequence Synthetic 58 accatcccct atgcaaacac 20
59 20 DNA Artificial Sequence Synthetic 59 gaccagagct gaaatctctt 20
60 19 DNA Artificial Sequence Synthetic 60 cacagtggct ttcctgctg 19
61 20 DNA Artificial Sequence Synthetic 61 tgtggtgggt tgatctgttt 20
62 18 DNA Artificial Sequence Synthetic 62 ccctggtgtc tgctcctg 18
63 20 DNA Artificial Sequence Synthetic 63 agcaatagcc ccttgtggag 20
64 20 DNA Artificial Sequence Synthetic 64 tctctcccac tcctctgagc 20
65 20 DNA Artificial Sequence Synthetic 65 tggcagtggt gtctctaagc 20
66 20 DNA Artificial Sequence Synthetic 66 ttggcaacag tggagatacg 20
67 20 DNA Artificial Sequence Synthetic 67 tcttgctgag cacctgtgac 20
68 20 DNA Artificial Sequence Synthetic 68 cactcgctgc gtgtattagt 20
69 18 DNA Artificial Sequence Synthetic 69 ccttgttggc ctctcgtg 18
70 19 DNA Artificial Sequence Synthetic 70 ggaaccaccc atgaccttg 19
71 20 DNA Artificial Sequence Synthetic 71 cagggaatac ttggaggaag 20
72 20 DNA Artificial Sequence Synthetic 72 gcagagaggt tgctggtgag 20
73 18 DNA Artificial Sequence Synthetic 73 aggctctggc caacactg 18
74 22 DNA Artificial Sequence Synthetic 74 catccatctg ttaactggaa gc 22
75 20 DNA Artificial Sequence Synthetic 75 cctggaccca caagtctgag 20
76 23 DNA Artificial Sequence Synthetic 76 gacgagcagt taaaccacca tag 23
77 20 DNA Artificial Sequence Synthetic 77 gctaaaggtg gggaacactc 20
78 20 DNA Artificial Sequence Synthetic 78 gtgccttcaa ggtttcactg 20
79 18 DNA Artificial Sequence Synthetic 79 catcagatgc ggggtctc 18
80 20 DNA Artificial Sequence Synthetic 80 cctgacatgc acaaatgacc 20
81 22 DNA Artificial Sequence Synthetic 81 tgcccactac atttatcctc ac 22
82 24 DNA Artificial Sequence Synthetic 82 gcaaacatat ttgtgaactt ttgc 24
83 23 DNA Artificial Sequence Synthetic 83 cgacgattat cttacaaatg tgg 23
84 20 DNA Artificial Sequence Synthetic 84 ggggacagag ggttttcttg 20
85 20 DNA Artificial Sequence Synthetic 85 gacaggcaca gtgcaaaaac 20
86 20 DNA Artificial Sequence Synthetic 86 gggttcacaa ggtccaacag 20
87 20 DNA Artificial Sequence Synthetic 87 aggtcagaac ctcagcgaag 20
88 21 DNA Artificial Sequence Synthetic 88 gcactggtca ccgtatgatt c 21
89 18 DNA Artificial Sequence Synthetic 89 acgctggaag cgtgactc 18
90 20 DNA Artificial Sequence Synthetic 90 cgagggagcc cacactctac 20
91 20 DNA Artificial Sequence Synthetic 91 cactgacagc accacgaatg 20
92 19 DNA Artificial Sequence Synthetic 92 gaggcaggga aaggatgtg 19
93 18 DNA Artificial Sequence Synthetic 93 tctcgggcag aattcgag 18
94 20 DNA Artificial Sequence Synthetic 94 agggacactg gtggagactg 20
95 20 DNA Artificial Sequence Synthetic 95 aggaggggag agaaggacac 20
96 19 DNA Artificial Sequence Synthetic 96 catgaggcca tctgtcacc 19
97 18 DNA Artificial Sequence Synthetic 97 aggatacccg tggggaag 18
98 20 DNA Artificial Sequence Synthetic 98 caagcccact ttcaatccac 20
99 18 DNA Artificial Sequence Synthetic 99 ccagctgaat gcccactg 18
100 19 DNA Artificial Sequence Synthetic 100 cagtggtccg agtcacagg 19
101 21 DNA Artificial Sequence Synthetic 101 gaggaactcg ctcctaaatg c 21
102 18 DNA Artificial Sequence Synthetic 102 accgggcttg tgctgtag 18

What is claimed is:
1. A method for detection of a variant nephroretinin polypeptide in a subject, comprising: a) providing a biological sample from a subject, wherein said biological sample comprises a nephroretinin polypeptide; and b) detecting the presence or absence of a variant nephroretinin polypeptide in said biological sample.
2. The method of claim 1, wherein said variant nephroretinin polypeptide is a C-terminal truncation of SEQ ID NO:2.
3. The method of claim 2, wherein said variant nephroretinin polypeptide is selected from the group consisting of SEQ ID NOs: 6, 10, 12, 14, 16, and 20.
4. The method of claim 1, wherein the presence of said variant nephroretinin polypeptide is indicative of nephronophthisis type 4 kidney disease in said subject.
5. The method of claim 1, wherein said biological sample is selected from the group consisting of a blood sample, a tissue sample, a urine sample, and an amniotic fluid sample.
6. The method of claim 1, wherein said subject is selected from the group consisting of an embryo, a fetus, a newborn animal, and a young animal.
7. The method of claim 6, wherein said animal is a human.
8. The method of claim 1, wherein said detecting comprises differential antibody binding.
9. A kit comprising a reagent for detecting the presence or absence of a variant nephroretinin polypeptide in a biological sample.
10. The kit of claim 9, further comprising instruction for using said kit for said detecting the presence or absence of a variant nephroretinin polypeptide in a biological sample.
11. The kit of claim 9, further comprising instructions for diagnosing nephronophthisis in said subject based on the presence or absence of said variant nephroretinin polypeptide.
12. The kit of claim 9, wherein said reagent is one or more antibodies.
13. The kit of claim 12, wherein said antibodies comprise a first antibody that specifically binds to the C-terminus of said nephroretinin polypeptide and a second antibody that specifically binds to the N-terminus of said nephroretinin polypeptide.
14. The kit of claim 9, wherein said variant nephroretinin polypeptide is a C-terminal truncation of SEQ ID NO:2.
15. The kit of claim 14, wherein said variant nephroretinin polypeptide is selected from the group consisting of SEQ ID NOs: 6, 10, 12, 14, 16, and 20.
16. A method for detection of a variant inversin polypeptide in a subject, comprising: a) providing a biological sample from a subject, wherein said biological sample comprises an inversin polypeptide; and b) detecting the presence or absence of a variant inversin polypeptide in said biological sample.
17. The method of claim 16, wherein said variant inversin polypeptide is a C-terminal truncation of SEQ ID NO:22.
18. The method of claim 17, wherein said variant inversin polypeptide is selected from the group consisting of SEQ ID NOs: 24, 26, 28, 30, 34, 36, 38 and 40.
19. The method of claim 16, wherein the presence of said variant inversin polypeptide is indicative of nephronophthisis type 2 kidney disease in said subject.
20. The method of claim 16, wherein said detecting comprises differential antibody binding.